[2/N][rollout] feat: support vllm/sglang DP+EP in server mode #3530
Code Review
This pull request adds support for Data Parallelism (DP) and Expert Parallelism (EP) in vLLM/SGLang server mode. The changes are extensive, touching configuration, testing, and core distributed logic. My review has identified several critical issues, including a syntax error in a shell script, a broken f-string in an assertion, and incorrect usage of uvicorn that would prevent the server from starting. I've also pointed out some areas for improvement in test portability and configuration validation. Please address these critical issues to ensure the new functionality works as expected.
For vLLM, `sleep(level=2)` does not work with expert parallelism; this is fixed in vllm-project/vllm#25458. As a workaround, when EP is enabled we call `sleep(level=1)` for now.

For SGLang, the same issue exists and is already fixed in sgl-project/sglang#8676, so we should upgrade SGLang to 0.5.2.
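
The vLLM workaround described in these comments can be sketched as a small level-selection helper. This is only an illustration under stated assumptions: `choose_sleep_level` and both of its parameters are hypothetical names, not part of vLLM, SGLang, or this PR. The only facts taken from the comment are that level 2 fails under expert parallelism until the upstream fix lands and that level 1 is the interim fallback.

```python
def choose_sleep_level(expert_parallel_enabled: bool,
                       ep_sleep_fix_available: bool = False) -> int:
    """Pick the sleep level to request from the rollout engine.

    Hypothetical helper: falls back to level 1 when expert parallelism (EP)
    is enabled and the installed engine lacks the upstream fix for level 2.
    """
    if expert_parallel_enabled and not ep_sleep_fix_available:
        # Workaround from the PR discussion: level 2 does not work with EP.
        return 1
    # Without EP (or once the fix is available), level 2 can be used.
    return 2


# Example: an EP-enabled rollout on an engine without the fix.
level = choose_sleep_level(expert_parallel_enabled=True)
```

Keeping the decision in one function means the fallback can be dropped in a single place once the engine version that contains the fix is the minimum supported one.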
### What does this PR do? Solve #3530 (comment)
from vllm.outputs import RequestOutput
from vllm.utils import FlexibleArgumentParser
from vllm.usage.usage_lib import UsageContext
from vllm.utils import FlexibleArgumentParser, get_tcp_uri
`get_tcp_uri` was introduced in vLLM v0.9.0, so the installation script needs to be updated.
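
Besides raising the pinned version in the installation script, one defensive option is to guard the import. The sketch below is not what the PR does. `fallback_tcp_uri` is a hypothetical stand-in, and its behavior (building a `tcp://host:port` string and bracketing IPv6 literals) is an assumption about what the vLLM helper returns.

```python
import ipaddress


def fallback_tcp_uri(ip: str, port: int) -> str:
    """Hypothetical stand-in that builds a tcp:// URI, bracketing IPv6 hosts."""
    try:
        is_ipv6 = ipaddress.ip_address(ip).version == 6
    except ValueError:
        # Not a literal IP address (for example a hostname); use it as-is.
        is_ipv6 = False
    host = f"[{ip}]" if is_ipv6 else ip
    return f"tcp://{host}:{port}"


try:
    # Per the review comment, this import needs vLLM v0.9.0 or newer.
    from vllm.utils import get_tcp_uri
except ImportError:
    # vLLM is missing or does not export the helper from this module.
    get_tcp_uri = fallback_tcp_uri
```

A guard like this only hides the version mismatch; updating the installation script, as the reviewer asks, remains the actual fix.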
What does this PR do?
Following #3456, support vllm/sglang DP+EP in server mode.