Conversation

@ILikeIneine
Collaborator

Purpose

This PR adds support for vllm v0.11.1.

Test Plan

Test Result

(Optional) Documentation Update


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing the test command.
  • The test results, such as pasting a before/after comparison or e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.

@ILikeIneine ILikeIneine self-assigned this Oct 27, 2025
@ILikeIneine ILikeIneine marked this pull request as draft October 27, 2025 07:52
@gemini-code-assist bot left a comment

Code Review

This pull request updates the codebase to support vllm v0.11.1, which involves significant refactoring around memory allocation, platform integration, and attention mechanisms. The changes appear to align with the goal of supporting the new vllm version. I have found one critical issue in the device allocator patch that could lead to a runtime error and have provided a fix.

Comment on lines +48 to +53
if len(self._sleep_saved_buffers):
    model = self.model_runner.model
    for name, buffer in model.named_buffers():
        if name in self._sleep_saved_buffers:
            buffer.data.copy_(self._sleep_saved_buffers[name].data)
    self._sleep_saved_buffers = {}

critical

There is a potential AttributeError here. The self._sleep_saved_buffers attribute is only initialized within the sleep method, and only when level == 2. If wake_up is called after sleep(level=1) or before any call to sleep, self._sleep_saved_buffers will not exist on the object, causing a crash when len() is called on it.

To prevent this, you should safely check for the attribute's existence before trying to access it.

Suggested change
-    if len(self._sleep_saved_buffers):
+    if hasattr(self, "_sleep_saved_buffers") and self._sleep_saved_buffers:
         model = self.model_runner.model
         for name, buffer in model.named_buffers():
             if name in self._sleep_saved_buffers:
                 buffer.data.copy_(self._sleep_saved_buffers[name].data)
         self._sleep_saved_buffers = {}
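
As an alternative to the hasattr() guard, the attribute could also be initialized unconditionally in the worker's constructor so that sleep() and wake_up() always see a consistent state. A minimal sketch of that idea, assuming a hypothetical PatchedWorker class (only _sleep_saved_buffers, model_runner, and the wake_up logic are taken from the snippet above; everything else is illustrative):

import torch


class PatchedWorker:
    """Hypothetical sketch; only the sleep-buffer bookkeeping is shown."""

    def __init__(self, model_runner):
        self.model_runner = model_runner
        # Initialized up front, so wake_up() never raises AttributeError,
        # even if sleep() was skipped or called with level != 2.
        self._sleep_saved_buffers: dict[str, torch.Tensor] = {}

    def wake_up(self, tags=None):
        if self._sleep_saved_buffers:
            model = self.model_runner.model
            for name, buffer in model.named_buffers():
                if name in self._sleep_saved_buffers:
                    buffer.data.copy_(self._sleep_saved_buffers[name].data)
            self._sleep_saved_buffers = {}

Either approach removes the crash path; the hasattr() variant has the advantage of not touching the constructor, which matters if the patch only overrides wake_up().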

@leex404 leex404 force-pushed the support-vllm-0.11.1 branch 2 times, most recently from ab31312 to f516af8 Compare November 4, 2025 08:35
ILikeIneine and others added 18 commits November 7, 2025 14:09
Signed-off-by: Hank <[email protected]>
Signed-off-by: Hank <[email protected]>
Signed-off-by: Hank <[email protected]>
…l` (#115)

* [fix] fix sample_recovered_tokens_kernel using too much private memory

Signed-off-by: Xin Li <[email protected]>

* [fix] fix type error in bf16_paged_mqa_logits

Signed-off-by: Xin Li <[email protected]>

* [chore] change file directory

Signed-off-by: Xin Li <[email protected]>

---------

Signed-off-by: Xin Li <[email protected]>
Co-authored-by: Xin Li <[email protected]>

Signed-off-by: leex404 <[email protected]>
@ILikeIneine ILikeIneine marked this pull request as ready for review November 14, 2025 02:09
@ILikeIneine ILikeIneine merged commit 0a392da into master Nov 14, 2025
2 of 4 checks passed
@ILikeIneine ILikeIneine changed the title [WIP] support v0.11.1 feat!: support v0.11.1 Nov 14, 2025
@ILikeIneine ILikeIneine deleted the support-vllm-0.11.1 branch November 24, 2025 07:26
@ILikeIneine ILikeIneine restored the support-vllm-0.11.1 branch November 24, 2025 07:26
@ILikeIneine ILikeIneine deleted the support-vllm-0.11.1 branch December 1, 2025 10:32