[Core]: KV Cache Transfer Encapsulation#979
[Core]: KV Cache Transfer Encapsulation#979hsliuustc0106 merged 20 commits intovllm-project:mainfrom
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fe624efe17
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Signed-off-by: princepride <[email protected]>
Signed-off-by: princepride <[email protected]>
Signed-off-by: princepride <[email protected]>
Signed-off-by: princepride <[email protected]>
Signed-off-by: princepride <[email protected]>
Signed-off-by: princepride <[email protected]>
0547aa2 to
6a12095
Compare
Signed-off-by: princepride <[email protected]>
Signed-off-by: princepride <[email protected]>
|
@natureofnature @tzhouam PTAL. |
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c1c0e2dec6
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Signed-off-by: princepride <[email protected]>
There was a problem hiding this comment.
Pull request overview
Refactors Omni KV-cache transfer by centralizing duplicated send/receive + connector lifecycle logic into a single OmniKVTransferManager, and updates AR/diffusion runners to delegate to it.
Changes:
- Added
OmniKVTransferManagerto encapsulate connector creation, KV extraction, send-with-retry, and receive-with-timeout. - Updated
GPUARModelRunnerandGPUDiffusionModelRunnerto use the new manager instead of inlined logic. - Adjusted SHM connector API for compatibility and updated KV-flow unit tests to validate the new manager.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| vllm_omni/worker/gpu_ar_model_runner.py | Replaces inlined sender-side KV transfer with manager calls and removes duplicated methods. |
| vllm_omni/distributed/omni_connectors/kv_transfer_manager.py | Introduces the unified manager and shared KV transfer data container/config. |
| vllm_omni/distributed/omni_connectors/connectors/shm_connector.py | Extends put/get to accept request_id for compatibility with manager usage. |
| vllm_omni/diffusion/worker/gpu_diffusion_model_runner.py | Replaces receiver-side KV polling/injection with manager calls. |
| tests/distributed/omni_connectors/test_kv_flow.py | Updates tests to validate extraction/send/receive flows via the new manager. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
vllm_omni/distributed/omni_connectors/connectors/shm_connector.py
Outdated
Show resolved
Hide resolved
vllm_omni/distributed/omni_connectors/connectors/shm_connector.py
Outdated
Show resolved
Hide resolved
vllm_omni/distributed/omni_connectors/connectors/shm_connector.py
Outdated
Show resolved
Hide resolved
Signed-off-by: princepride <[email protected]>
|
@tzhouam @hsliuustc0106 PTAL😊 |
Signed-off-by: princepride <[email protected]>
Signed-off-by: 汪志鹏 <[email protected]>
|
@congw729 I want also add Bagel e2e pytest in this pr, what do you think? |
Signed-off-by: princepride <[email protected]>
Signed-off-by: princepride <[email protected]>
Signed-off-by: princepride <[email protected]>
Signed-off-by: princepride <[email protected]>
Signed-off-by: princepride <[email protected]>
Signed-off-by: princepride <[email protected]>
Signed-off-by: 汪志鹏 <[email protected]>
|
@hsliuustc0106 Ready to merge. |
Signed-off-by: princepride <[email protected]> Signed-off-by: 汪志鹏 <[email protected]>
PLEASE FILL IN THE PR DESCRIPTION HERE ENSURING ALL CHECKLIST ITEMS (AT THE BOTTOM) HAVE BEEN CONSIDERED.
Purpose
Related: #944
Refactor the KV cache transfer logic by extracting duplicated code from GPUARModelRunner and GPUDiffusionModelRunner into a unified OmniKVTransferManager class.
Test Plan
Unit Test
End2End Test
Test Result