feat!: support v0.11.1 #112
Code Review
This pull request updates the codebase to support vLLM v0.11.1, which involves significant refactoring around memory allocation, platform integration, and attention mechanisms. The changes appear consistent with that goal. I found one critical issue in the device allocator patch that could lead to a runtime error, and I have provided a fix.
    if len(self._sleep_saved_buffers):
        model = self.model_runner.model
        for name, buffer in model.named_buffers():
            if name in self._sleep_saved_buffers:
                buffer.data.copy_(self._sleep_saved_buffers[name].data)
        self._sleep_saved_buffers = {}
There is a potential AttributeError here. The self._sleep_saved_buffers attribute is only initialized within the sleep method, and only when level == 2. If wake_up is called after sleep(level=1) or before any call to sleep, self._sleep_saved_buffers will not exist on the object, causing a crash when len() is called on it.
To prevent this, you should safely check for the attribute's existence before trying to access it.
    - if len(self._sleep_saved_buffers):
    -     model = self.model_runner.model
    -     for name, buffer in model.named_buffers():
    -         if name in self._sleep_saved_buffers:
    -             buffer.data.copy_(self._sleep_saved_buffers[name].data)
    -     self._sleep_saved_buffers = {}
    + if hasattr(self, "_sleep_saved_buffers") and self._sleep_saved_buffers:
    +     model = self.model_runner.model
    +     for name, buffer in model.named_buffers():
    +         if name in self._sleep_saved_buffers:
    +             buffer.data.copy_(self._sleep_saved_buffers[name].data)
    +     self._sleep_saved_buffers = {}
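To make the failure mode concrete, here is a minimal sketch that reproduces it outside the real worker. `ToyWorker`, its `buffers` dict, and the two `wake_up_*` method names are hypothetical stand-ins invented for illustration (plain lists replace tensor buffers); the only behavior taken from the review is that `_sleep_saved_buffers` is created solely by `sleep(level=2)`.

```python
class ToyWorker:
    """Hypothetical stand-in for the worker; not the real vLLM class."""

    def __init__(self):
        # Plain lists stand in for the model's tensor buffers.
        self.buffers = {"scale": [1.0, 2.0]}

    def sleep(self, level=1):
        # As described in the review: the attribute is only created at level 2.
        if level == 2:
            self._sleep_saved_buffers = {
                name: list(buf) for name, buf in self.buffers.items()
            }

    def wake_up_unguarded(self):
        # Raises AttributeError if sleep(level=2) never ran.
        if len(self._sleep_saved_buffers):
            self._restore()

    def wake_up_guarded(self):
        # getattr with a default covers both "never set" and "set but empty".
        if getattr(self, "_sleep_saved_buffers", None):
            self._restore()

    def _restore(self):
        # Copy saved values back in place, then clear the saved state.
        for name, buf in self.buffers.items():
            if name in self._sleep_saved_buffers:
                buf[:] = self._sleep_saved_buffers[name]
        self._sleep_saved_buffers = {}


worker = ToyWorker()
worker.sleep(level=1)
try:
    worker.wake_up_unguarded()
    unguarded_raised = False
except AttributeError:
    unguarded_raised = True

worker.wake_up_guarded()  # no error: nothing was saved, so nothing is restored
```

The `getattr(..., None)` form is equivalent to the `hasattr(...) and ...` check in the suggestion. Another option, assuming the class's constructor can be patched, would be to initialize `self._sleep_saved_buffers = {}` there so that the original `len()` check is always safe.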
a601543 to 1f48880
ab31312 to f516af8
Signed-off-by: Hank <[email protected]>
Signed-off-by: Xin Li <[email protected]>
Signed-off-by: leex404 <[email protected]>
…l` (#115)
* [fix] fix sample_recovered_tokens_kernel use too much private memory
* [fix] fix type error in bf16_paged_mqa_logits
* [chore] change file directory
Signed-off-by: Xin Li <[email protected]>
Co-authored-by: Xin Li <[email protected]>
Signed-off-by: leex404 <[email protected]>
de238f9 to 32d2d83
related: vllm-project/vllm/pull/27322 Signed-off-by: Hank <[email protected]>
Purpose
This PR adds support for vLLM v0.11.1.
Test Plan
Test Result
(Optional) Documentation Update
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.