david6666666 (Contributor) commented Nov 25, 2025

Purpose

Following up on the comment in #22179, add Async EPLB nightly CI tests for:

  • DeepSeek V2-Lite Async EPLB Accuracy
  • Qwen3-Next-80B-A3B-Instruct MTP Async EPLB Accuracy

Test Plan

bash .buildkite/scripts/scheduled_integration_test/deepseek_v2_lite_ep_async_eplb.sh 0.25 1319 8030
bash .buildkite/scripts/scheduled_integration_test/qwen3_next_mtp_async_eplb.sh 0.8 1319 8040

Test Result

/tmp/vllm-scheduled/deepseek-ai_DeepSeek-V2-lite_deepep_high_throughput_async_eplb.json
{
  "accuracy": 0.3601213040181956,
  "invalid_rate": 0.006065200909780136,
  "latency": 85.7941705584526,
  "questions_per_second": 15.37400491681832,
  "total_output_tokens": 151616,
  "tokens_per_second": 1767.206314987359,
  "num_questions": 1319,
  "num_shots": 5,
  "max_tokens": 256,
  "timestamp": 1764056573.579663
}
/tmp/vllm-scheduled/deepseek-ai_DeepSeek-V2-lite_deepep_low_latency_async_eplb.json
{
  "accuracy": 0.3646702047005307,
  "invalid_rate": 0.006065200909780136,
  "latency": 73.81600442156196,
  "questions_per_second": 17.86875367118509,
  "total_output_tokens": 152375,
  "tokens_per_second": 2064.2542385495285,
  "num_questions": 1319,
  "num_shots": 5,
  "max_tokens": 256,
  "timestamp": 1764056759.8287513
}
/tmp/vllm-scheduled/Qwen__Qwe_3-Next-80B-A3B-I_struct_deepep_high_throughput.json
{
  "accuracy": 0.8658074298711145,
  "invalid_rate": 0.0,
  "latency": 132.54091829434037,
  "questions_per_second": 9.951643741224348,
  "total_output_tokens": 209727,
  "tokens_per_second": 1582.356623893676,
  "num_questions": 1319,
  "num_shots": 5,
  "max_tokens": 256,
  "timestamp": 1764058291.3716583
}
/tmp/vllm-scheduled/Qwen__Qwe_3-Next-80B-A3B-I_struct_deepep_low_latency.json
{
  "accuracy": 0.8605003790750568,
  "invalid_rate": 0.0,
  "latency": 132.2467623502016,
  "questions_per_second": 9.973779142563554,
  "total_output_tokens": 209355,
  "tokens_per_second": 1583.0633300920342,
  "num_questions": 1319,
  "num_shots": 5,
  "max_tokens": 256,
  "timestamp": 1764058529.979115
}
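
As a sanity check on the figures above, the two throughput fields appear to be derived from the raw counts: questions_per_second matches num_questions / latency, and tokens_per_second matches total_output_tokens / latency. Below is a minimal Python sketch using the first DeepSeek result. The helper name derived_rates is hypothetical, and treating the first script argument in the test plan (0.25) as a minimum-accuracy threshold is an assumption, not something the PR states.

```python
import math

# First result above (DeepSeek-V2-Lite, deepep_high_throughput, async EPLB).
result = {
    "accuracy": 0.3601213040181956,
    "latency": 85.7941705584526,
    "questions_per_second": 15.37400491681832,
    "total_output_tokens": 151616,
    "tokens_per_second": 1767.206314987359,
    "num_questions": 1319,
}


def derived_rates(r: dict) -> tuple[float, float]:
    """Recompute (questions/s, tokens/s) from the raw counts and latency."""
    return r["num_questions"] / r["latency"], r["total_output_tokens"] / r["latency"]


qps, tps = derived_rates(result)
# The reported rates agree with the recomputed ones to within rounding.
assert math.isclose(qps, result["questions_per_second"], rel_tol=1e-6)
assert math.isclose(tps, result["tokens_per_second"], rel_tol=1e-6)

# Assumption: 0.25 (first argument in the test plan) is the accuracy floor.
ASSUMED_THRESHOLD = 0.25
print(result["accuracy"] >= ASSUMED_THRESHOLD)  # True
```

The same check can be repeated for the other three result files by swapping in their values.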


mergify bot added the ci/build label Nov 25, 2025

gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces nightly CI tests for asynchronous EPLB for the DeepSeek-V2-Lite and Qwen3-Next models. The changes are well-structured, adding new test scripts and updating the Buildkite pipeline. However, I've identified a critical issue in the new test scripts: the use_async flag in the EPLB configuration is incorrectly passed as a string ("true") instead of a boolean (true). This will prevent the async feature from being enabled, meaning the tests would not cover the intended functionality. I've provided suggestions to correct this in the review comments.

--data-parallel-size 2 \
--enable-expert-parallel \
--enable-eplb \
--eplb-config '{"window_size":200,"step_interval":600,"use_async":"true"}' \

critical

The use_async parameter in the JSON for --eplb-config is specified as the string "true". For it to be correctly parsed as a boolean value by Pydantic, it should be the JSON boolean literal true (without quotes). With the current value, the async feature will not be enabled, which defeats the purpose of this test.

Suggested change
- --eplb-config '{"window_size":200,"step_interval":600,"use_async":"true"}' \
+ --eplb-config '{"window_size":200,"step_interval":600,"use_async":true}' \
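
The difference the review points to is visible at the JSON level, before any validation runs. Below is a minimal sketch using Python's standard json module. The helper use_async_type is hypothetical, and how vLLM's config validation (Pydantic, per the comment) then treats a string value is not modeled here.

```python
import json

# The two --eplb-config payloads from the suggested change above.
quoted = '{"window_size":200,"step_interval":600,"use_async":"true"}'
unquoted = '{"window_size":200,"step_interval":600,"use_async":true}'


def use_async_type(config_json: str) -> type:
    """Return the Python type json.loads produces for the use_async field."""
    return type(json.loads(config_json)["use_async"])


print(use_async_type(quoted))    # <class 'str'>
print(use_async_type(unquoted))  # <class 'bool'>
```

Passing the unquoted literal removes any dependence on string-to-boolean coercion further down the line.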

--tensor-parallel-size 4 \
--enable-expert-parallel \
--enable-eplb \
--eplb-config '{"window_size":200,"step_interval":600,"use_async":"true"}' \

critical

The use_async parameter in the JSON for --eplb-config is specified as the string "true". For it to be correctly parsed as a boolean value by Pydantic, it should be the JSON boolean literal true (without quotes). With the current value, the async feature will not be enabled, which defeats the purpose of this test.

Suggested change
- --eplb-config '{"window_size":200,"step_interval":600,"use_async":"true"}' \
+ --eplb-config '{"window_size":200,"step_interval":600,"use_async":true}' \

david6666666 (Contributor, Author) commented Nov 25, 2025

@tlrmchlsmth @DarkLight1337 PTAL, thx

Signed-off-by: David Chen <[email protected]>
david6666666 changed the title from "[CI] Add Async Eplb nightly CI" to "[CI] Add Async Eplb nightly CI tests" Nov 25, 2025
Signed-off-by: David Chen <[email protected]>
DarkLight1337 added the ready label (ONLY add when PR is ready to merge/full CI is needed) Nov 27, 2025
david6666666 (Contributor, Author) commented

The CI failure in test_eagle_correctness is unrelated to this PR.
The 2 cases added in this PR passed. @DarkLight1337 PTAL, thx

DarkLight1337 (Member) commented

Merging, sorry for the delay!

david6666666 (Contributor, Author) commented

Quoting DarkLight1337: "Merging, sorry for the delay!"

Waiting to be merged.

DarkLight1337 (Member) commented Nov 29, 2025

I don't have permission to force-merge this PR, so we need to wait for @simon-mo @WoosukKwon @youkaichao

david6666666 (Contributor, Author) commented

@DarkLight1337 CI passed, PTAL, thx.

DarkLight1337 enabled auto-merge (squash) December 2, 2025 07:32

tlrmchlsmth (Member) left a comment

Thank you for adding the test!

--data-parallel-size 2 \
--enable-expert-parallel \
--enable-eplb \
--eplb-config '{"window_size":200,"step_interval":600}' \

Nice catch. I'm not sure the default step interval would actually trigger EPLB here.

DarkLight1337 merged commit 7fe9c1a into vllm-project:main Dec 3, 2025
52 checks passed
LucasWilkinson added a commit that referenced this pull request Dec 4, 2025
charlotte12l pushed a commit to charlotte12l/vllm that referenced this pull request Dec 5, 2025
Signed-off-by: David Chen <[email protected]>
Signed-off-by: WeiQing Chen <[email protected]>
Co-authored-by: Cyrus Leung <[email protected]>
Signed-off-by: Xingyu Liu <[email protected]>