Skip to content

Conversation

@NickLucche
Copy link
Collaborator

@NickLucche NickLucche commented Oct 30, 2025

Small quality of life improvement by removing some of the kv transfer-specifc code in layer.py and refactoring that into a decorator. I believe the on entry-on exit pattern (wait_read-wait_write) here is very suitable for that.
The result is simply that there's less non-attention related code in the file. Behavior should be unchanged.

Also, I found that after grouping some common boilerplate code for both maybe_save_kv_layer_to_connector and wait_for_kv_layer_from_connector, I think there's too little left to justify a separate function for both, hence I ended-up inlining both connector method calls.

cc @ApostaC who wrote the initial connector code

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the KV cache transfer logic into a decorator, which is a great improvement for code clarity and maintainability in layer.py. The implementation is clean and correctly captures the on-entry/on-exit pattern.

However, I've found a critical issue in the new decorator. It attempts to access the layer_name parameter from keyword arguments, but this parameter is passed positionally in all call sites. This will cause a KeyError at runtime. I've provided a suggestion to fix this by using the inspect module to robustly retrieve the argument.

Once this is addressed, the PR will be in excellent shape.

@NickLucche
Copy link
Collaborator Author

@codex review

@markmc
Copy link
Member

markmc commented Oct 31, 2025

Rebase to pick up #27861 and #27811

Copy link
Member

@markmc markmc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love the idea in general, suggestion inline

@markmc
Copy link
Member

markmc commented Oct 31, 2025

Oh, see also #27026 - I'm proposing skipping the wait_for_layer_load() and save_kv_layer() if there is no KV connector metadata bound as a quick fix for #26675

@NickLucche NickLucche force-pushed the refactor-wait-kv-load branch from 8abc2cd to e3914fa Compare November 3, 2025 13:45
@NickLucche
Copy link
Collaborator Author

Thanks for reviewing @markmc !

Copy link
Collaborator

@ProExpertProg ProExpertProg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few nits and Qs

@NickLucche NickLucche force-pushed the refactor-wait-kv-load branch from e3914fa to 5546494 Compare November 4, 2025 17:28
@NickLucche NickLucche added the ready ONLY add when PR is ready to merge/full CI is needed label Nov 5, 2025
@NickLucche NickLucche closed this Nov 7, 2025
@NickLucche NickLucche reopened this Nov 7, 2025
@NickLucche NickLucche force-pushed the refactor-wait-kv-load branch from 5f24f38 to f3b6244 Compare November 9, 2025 12:24
@mergify
Copy link

mergify bot commented Nov 11, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @NickLucche.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Nov 11, 2025
@NickLucche NickLucche force-pushed the refactor-wait-kv-load branch from 9a18fa5 to 900be1f Compare November 11, 2025 17:34
@mergify mergify bot removed the needs-rebase label Nov 11, 2025
@NickLucche
Copy link
Collaborator Author

@hmellor can you tell why the docs build really dislikes this PR? :)

Copy link
Collaborator

@ProExpertProg ProExpertProg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice cleanup

Returns:
A tuple containing:
- attn_metadata: Attention metadata for this specific layer, or None if
no metadata available
Copy link
Member

@hmellor hmellor Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NickLucche https://app.readthedocs.org/projects/vllm/builds/30282990/#293811637--1144

Suggested change
no metadata available
no metadata available

or

Suggested change
no metadata available
no metadata available

(not sure which will render better)

@ptovam
Copy link
Contributor

ptovam commented Nov 11, 2025

Oh, see also #27026 - I'm proposing skipping the wait_for_layer_load() and save_kv_layer() if there is no KV connector metadata bound as a quick fix for #26675

@NickLucche any reason for not checking connector.has_connector_metadata() in the decorator? (was introduced in #28253)

@NickLucche
Copy link
Collaborator Author

Thanks @ptovam for spotting this! Rebase cruft on me

NickLucche and others added 5 commits November 12, 2025 10:12
Signed-off-by: NickLucche <[email protected]>
Reduces code duplication between the maybe_transfer_kv_layer and
the functions it decorates.

Signed-off-by: Mark McLoughlin <[email protected]>
Signed-off-by: NickLucche <[email protected]>
Signed-off-by: NickLucche <[email protected]>
@NickLucche NickLucche force-pushed the refactor-wait-kv-load branch from 9d06f87 to 7ba7765 Compare November 12, 2025 10:12
@NickLucche NickLucche enabled auto-merge (squash) November 12, 2025 10:34
@NickLucche NickLucche merged commit 728a9eb into vllm-project:main Nov 12, 2025
47 checks passed
geodavic pushed a commit to geodavic/vllm that referenced this pull request Nov 16, 2025
…oject#27816)

Signed-off-by: NickLucche <[email protected]>
Signed-off-by: Mark McLoughlin <[email protected]>
Co-authored-by: Mark McLoughlin <[email protected]>
Signed-off-by: George D. Torres <[email protected]>
bwasti pushed a commit to bwasti/vllm that referenced this pull request Nov 17, 2025
…oject#27816)

Signed-off-by: NickLucche <[email protected]>
Signed-off-by: Mark McLoughlin <[email protected]>
Co-authored-by: Mark McLoughlin <[email protected]>
Signed-off-by: Bram Wasti <[email protected]>
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025
…oject#27816)

Signed-off-by: NickLucche <[email protected]>
Signed-off-by: Mark McLoughlin <[email protected]>
Co-authored-by: Mark McLoughlin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready ONLY add when PR is ready to merge/full CI is needed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants