
Conversation

@rainyfly (Collaborator) commented Nov 5, 2025

Motivation

  1. Fix the DP rank for multi-node deployments.
  2. Let the decode (D) instance perform decode in its first step.
  3. Optimize engine worker queue performance for multimodal (mm) data.

paddle-bot bot commented Nov 5, 2025

Thanks for your contribution!

has_prefill_task = True
if (
    self.fd_config.scheduler_config.splitwise_role == "decode"
):  # In PD, we continue to decode after P generates the first token
Collaborator

has_prefill_task = False?

@rainyfly (Collaborator, Author) Nov 5, 2025

In gpu_model_runner.py this variable is only used to control need_not_stop for the current step, so it does not need to be changed.
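For context, here is a minimal sketch of the idea being discussed: forcing `has_prefill_task` to True on a decode instance only affects `need_not_stop` for the current step, which keeps the D instance stepping after P has produced the first token. The function and argument names below are illustrative assumptions, not the actual gpu_model_runner.py code.

```python
# Illustrative sketch only: compute_need_not_stop and its arguments are
# assumed names, not FastDeploy's real API.
def compute_need_not_stop(splitwise_role: str,
                          has_prefill_task: bool,
                          has_decode_task: bool) -> bool:
    # In a PD-disaggregated deployment, the decode ("D") instance schedules
    # no prefill work of its own, yet it must not stop before its first
    # decode step once the prefill ("P") instance has sent the first token.
    # Forcing the flag True only influences need_not_stop for this step.
    if splitwise_role == "decode":
        has_prefill_task = True
    return has_prefill_task or has_decode_task


if __name__ == "__main__":
    # D instance with no local prefill task still keeps stepping.
    assert compute_need_not_stop("decode", has_prefill_task=False, has_decode_task=False)
    # A prefill instance with nothing scheduled is allowed to stop.
    assert not compute_need_not_stop("prefill", has_prefill_task=False, has_decode_task=False)
```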
