-
-
Notifications
You must be signed in to change notification settings - Fork 11.6k
[NIXL] Add remote_request_id to kv_transfer_params #29665
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Example of it working on top of #27987 Prefill side: Decode side, notice the decode request id is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces remote_request_id to make the NIXL notification more robust, which is a good improvement. The changes correctly plumb this new ID to where it's needed for constructing the notification ID. However, I've found a critical issue where direct dictionary access on kv_transfer_params could lead to a KeyError in certain paths. I've provided a suggestion to make this access safer.
Include the internal request ID that the prefill instance is expecting the decode instance to send it in the NIXL notification. Right now, we rely on the proxy supplying the ID via X-Request-ID and that prefill and decode will mangle this ID in identical ways. This is obviously quite brittle, and P should be explicit about what ID it expects from D. Relates to vllm-project#27987 - adding a random prefix to client-provided request IDs. Signed-off-by: Mark McLoughlin <[email protected]>
c519e25 to
9d04ff1
Compare
Include the internal request ID that the prefill instance is expecting the decode instance to send it in the NIXL notification.
Right now, we rely on the proxy supplying the ID via X-Request-ID and that prefill and decode will mangle this ID in identical ways. This is obviously quite brittle, and P should be explicit about what ID it expects from D.
Relates to #27987 - adding a random prefix to client-provided request IDs.