vllm 0.16.0 Support in current plugin #769
Closed
Labels
help wanted: Extra attention is needed
vllm-spyre-old: Related to the continued maintenance of the old `vllm-spyre` plugin on the `torch_sendnn` stack.
Status
Done
Feature description
Add support for vllm 0.16.0 in the current plugin.
Motivation and context
vllm version 0.16.0 contains PR #32863, which fixes the wrong error message bug #33418. There is an internal request to fix this bug, which appears when the length of a request does not fit within the max_context length.
cc: @karthick-vasakar @yannicks1
Proposed solution
No response
Checklist