-
-
Notifications
You must be signed in to change notification settings - Fork 11.8k
[Optimization] Make new_block_ids None if empty #23262
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Woosuk Kwon <[email protected]>
|
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces an optimization to avoid serializing empty block ID lists for decode requests, which occurs frequently. This is achieved by returning None instead of an empty list structure. The changes are consistently applied across the scheduler and worker components. The type hints have been updated to reflect the optional return value, and consumers of this data now correctly handle the None case. The refactoring to delay block ID generation is a clean approach. The code quality is high, and the changes appear correct and safe.
heheda12345
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
I don't think this optimization is necessary. And FYI, we now support mamba by setting |
…ne if empty (#93) Culprit commit: vllm-project/vllm#23262 --------- Signed-off-by: Agata Dobrzyniewicz <[email protected]>
Signed-off-by: Woosuk Kwon <[email protected]> Signed-off-by: Duncan Moss <[email protected]>
### What this PR does / why we need it? 1. use action/checkout@v5 instead of v4 2. remove dbo test case because there is issue with it and will be refactored later 3. make vllm-ascend compatible with vllm v0.10.1.1 and add CI for it 4. fix sampler api changes introduced by vllm-project/vllm#22387 6. fix qwen3 moe config changes intruoduced by vllm-project/vllm#20562 7. fix kvcache block changes introduced by vllm-project/vllm#23262 ### Does this PR introduce _any_ user-facing change? N/A ### How was this patch tested? CI passed with existing test. - vLLM version: v0.10.0 - vLLM main: vllm-project/vllm@0c6e40b --------- Signed-off-by: MengqingCao <[email protected]>
…ne if empty (#93) Culprit commit: vllm-project/vllm#23262 --------- Signed-off-by: Agata Dobrzyniewicz <[email protected]> Signed-off-by: Marcin Swiniarski <[email protected]>
Signed-off-by: Woosuk Kwon <[email protected]>
Signed-off-by: Woosuk Kwon <[email protected]> Signed-off-by: Xiao Yu <[email protected]>
Signed-off-by: Woosuk Kwon <[email protected]>
Signed-off-by: Woosuk Kwon <[email protected]>
Signed-off-by: Woosuk Kwon <[email protected]>
### What this PR does / why we need it? 1. use action/checkout@v5 instead of v4 2. remove dbo test case because there is issue with it and will be refactored later 3. make vllm-ascend compatible with vllm v0.10.1.1 and add CI for it 4. fix sampler api changes introduced by vllm-project/vllm#22387 6. fix qwen3 moe config changes intruoduced by vllm-project/vllm#20562 7. fix kvcache block changes introduced by vllm-project/vllm#23262 ### Does this PR introduce _any_ user-facing change? N/A ### How was this patch tested? CI passed with existing test. - vLLM version: v0.10.0 - vLLM main: vllm-project/vllm@0c6e40b --------- Signed-off-by: MengqingCao <[email protected]>
Signed-off-by: Woosuk Kwon <[email protected]>
### What this PR does / why we need it? 1. use action/checkout@v5 instead of v4 2. remove dbo test case because there is issue with it and will be refactored later 3. make vllm-ascend compatible with vllm v0.10.1.1 and add CI for it 4. fix sampler api changes introduced by vllm-project/vllm#22387 6. fix qwen3 moe config changes intruoduced by vllm-project/vllm#20562 7. fix kvcache block changes introduced by vllm-project/vllm#23262 ### Does this PR introduce _any_ user-facing change? N/A ### How was this patch tested? CI passed with existing test. - vLLM version: v0.10.0 - vLLM main: vllm-project/vllm@0c6e40b --------- Signed-off-by: MengqingCao <[email protected]>
### What this PR does / why we need it? 1. use action/checkout@v5 instead of v4 2. remove dbo test case because there is issue with it and will be refactored later 3. make vllm-ascend compatible with vllm v0.10.1.1 and add CI for it 4. fix sampler api changes introduced by vllm-project/vllm#22387 6. fix qwen3 moe config changes intruoduced by vllm-project/vllm#20562 7. fix kvcache block changes introduced by vllm-project/vllm#23262 ### Does this PR introduce _any_ user-facing change? N/A ### How was this patch tested? CI passed with existing test. - vLLM version: v0.10.0 - vLLM main: vllm-project/vllm@0c6e40b --------- Signed-off-by: MengqingCao <[email protected]>
Since our page size is 16,
new_block_idsfor decode requests is empty for every 15 out of 16 steps.However, currently we serialize the empty lists every step.
This PR fixes this unnecessary overheads by making them
Nonebefore serialization.TODO: When
block_sizeis different across groups, we should changelist[Optional[tuple[list[int], ...]]intolist[Optional[tuple[Optional[list[int], ....]]]].