[Core]: KV Cache Transfer Encapsulation#979

Merged
hsliuustc0106 merged 20 commits into vllm-project:main from princepride:kv-cache-transfer-encapsulation
Jan 28, 2026
Conversation

@princepride (Collaborator) commented Jan 27, 2026

Purpose

Related: #944
Refactor the KV cache transfer logic by extracting duplicated code from GPUARModelRunner and GPUDiffusionModelRunner into a unified OmniKVTransferManager class.
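To make the shape of the refactor concrete, here is a minimal sketch of what a unified KV-transfer manager can look like: one object owns the connector and exposes send-with-retry and receive-with-timeout, so both runners delegate instead of duplicating the logic. All names except OmniKVTransferManager's role are illustrative assumptions, not the PR's actual API.

```python
import time
from typing import Any, Optional


class KVTransferManagerSketch:
    """Illustrative sketch of a unified KV-transfer manager (hypothetical
    API, not the actual OmniKVTransferManager): it owns a connector and
    centralizes send-with-retry and receive-with-timeout."""

    def __init__(self, connector, max_retries: int = 3, poll_interval: float = 0.01):
        self.connector = connector  # any object exposing put()/get()
        self.max_retries = max_retries
        self.poll_interval = poll_interval

    def send_kv(self, request_id: str, kv_data: Any) -> bool:
        """Hand the KV blob to the connector, retrying on failure."""
        for _ in range(self.max_retries):
            if self.connector.put(request_id, kv_data):
                return True
        return False

    def receive_kv(self, request_id: str, timeout: float = 1.0) -> Optional[Any]:
        """Poll the connector until the KV blob arrives or the timeout expires."""
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            data = self.connector.get(request_id)
            if data is not None:
                return data
            time.sleep(self.poll_interval)
        return None


class DictConnector:
    """Minimal in-memory stand-in for a real connector (e.g. SHM)."""

    def __init__(self):
        self.store = {}

    def put(self, request_id, data):
        self.store[request_id] = data
        return True

    def get(self, request_id):
        return self.store.get(request_id)


manager = KVTransferManagerSketch(DictConnector())
manager.send_kv("req-1", {"layer0": [1.0, 2.0]})
print(manager.receive_kv("req-1"))  # → {'layer0': [1.0, 2.0]}
```

The design point is that retry and timeout policy live in one place; swapping the connector implementation does not touch the runners.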

Test Plan

Unit Test

pytest -o "addopts=" tests/distributed/omni_connectors/test_kv_flow.py

End2End Test

python3 examples/offline_inference/bagel/end2end.py --prompts "A cute cat" --modality text2img

# mooncake master (metadata server)
mooncake_master \
  --rpc_port=50051 \
  --enable_http_metadata_server=true \
  --http_metadata_server_host=0.0.0.0 \
  --http_metadata_server_port=8080 \
  --metrics_port=9003

# vllm-omni server
python3 examples/offline_inference/bagel/end2end.py --prompts "A cute cat" --modality text2img --stage-configs-path vllm_omni/model_executor/stage_configs/bagel_multiconnector.yaml

Test Result

========================================================================================= test session starts =========================================================================================
platform linux -- Python 3.13.11, pytest-9.0.2, pluggy-1.6.0
rootdir: /proj-tango-pvc/users/zhipeng.wang/workspace/vllm-omni
configfile: pyproject.toml
plugins: forked-1.6.0, rerunfailures-16.1, shard-0.1.2, timeout-2.4.0, anyio-4.12.1, asyncio-1.3.0
asyncio: mode=Mode.STRICT, debug=False, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 3 items                                                                                                                                                                                     
Running 3 items in this shard

tests/distributed/omni_connectors/test_kv_flow.py ...                                                                                                                                           [100%]

========================================================================================== warnings summary ===========================================================================================
<frozen importlib._bootstrap>:488
  <frozen importlib._bootstrap>:488: DeprecationWarning: builtin type SwigPyPacked has no __module__ attribute

<frozen importlib._bootstrap>:488
  <frozen importlib._bootstrap>:488: DeprecationWarning: builtin type SwigPyObject has no __module__ attribute

tests/distributed/omni_connectors/test_kv_flow.py::TestKVFlow::test_manager_extraction
  /root/.local/share/uv/python/cpython-3.13.11-linux-x86_64-gnu/lib/python3.13/unittest/case.py:707: DeprecationWarning: It is deprecated to return a value that is not None from a test case (<bound method TestKVFlow.test_manager_extraction of <test_kv_flow.TestKVFlow testMethod=test_manager_extraction>>)
    return self.run(*args, **kwds)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=================================================================================== 3 passed, 3 warnings in 13.25s ====================================================================================

@chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fe624efe17

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Signed-off-by: princepride <[email protected]>
Signed-off-by: princepride <[email protected]>
Signed-off-by: princepride <[email protected]>
Signed-off-by: princepride <[email protected]>
@princepride princepride force-pushed the kv-cache-transfer-encapsulation branch from 0547aa2 to 6a12095 Compare January 27, 2026 10:54
Signed-off-by: princepride <[email protected]>
Signed-off-by: princepride <[email protected]>
@princepride (Collaborator, Author) commented:

@natureofnature @tzhouam PTAL.

@princepride princepride mentioned this pull request Jan 27, 2026
@ZJY0516 (Collaborator) commented Jan 27, 2026

@codex review

@chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c1c0e2dec6

Copilot AI (Contributor) left a comment

Pull request overview

Refactors Omni KV-cache transfer by centralizing duplicated send/receive + connector lifecycle logic into a single OmniKVTransferManager, and updates AR/diffusion runners to delegate to it.

Changes:

  • Added OmniKVTransferManager to encapsulate connector creation, KV extraction, send-with-retry, and receive-with-timeout.
  • Updated GPUARModelRunner and GPUDiffusionModelRunner to use the new manager instead of inlined logic.
  • Adjusted SHM connector API for compatibility and updated KV-flow unit tests to validate the new manager.
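The SHM connector adjustment in the last bullet follows a common backward-compatibility pattern: add the new request_id parameter with a default so existing callers keep working. A hypothetical sketch (the names and signatures below are assumptions, not the PR's actual shm_connector API):

```python
from typing import Any, Optional


class ShmConnectorSketch:
    """Hypothetical connector whose put/get gained an optional request_id,
    so the new manager can address entries per request while older callers
    that pass only a key remain unchanged."""

    def __init__(self):
        self._store: dict = {}

    def _key(self, key: str, request_id: Optional[str]) -> str:
        # Namespace the entry by request when a request_id is supplied.
        return f"{request_id}/{key}" if request_id else key

    def put(self, key: str, value: Any, request_id: Optional[str] = None) -> bool:
        self._store[self._key(key, request_id)] = value
        return True

    def get(self, key: str, request_id: Optional[str] = None) -> Optional[Any]:
        return self._store.get(self._key(key, request_id))


conn = ShmConnectorSketch()
conn.put("kv", b"blob")                           # old-style call still works
conn.put("kv", b"blob-42", request_id="req-42")   # new manager-style call
print(conn.get("kv"))                             # → b'blob'
print(conn.get("kv", request_id="req-42"))        # → b'blob-42'
```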

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 9 comments.

Summary per file:

  • vllm_omni/worker/gpu_ar_model_runner.py: Replaces inlined sender-side KV transfer with manager calls and removes duplicated methods.
  • vllm_omni/distributed/omni_connectors/kv_transfer_manager.py: Introduces the unified manager and shared KV transfer data container/config.
  • vllm_omni/distributed/omni_connectors/connectors/shm_connector.py: Extends put/get to accept request_id for compatibility with manager usage.
  • vllm_omni/diffusion/worker/gpu_diffusion_model_runner.py: Replaces receiver-side KV polling/injection with manager calls.
  • tests/distributed/omni_connectors/test_kv_flow.py: Updates tests to validate extraction/send/receive flows via the new manager.


@princepride (Collaborator, Author) commented:

@tzhouam @hsliuustc0106 PTAL😊

Signed-off-by: princepride <[email protected]>
@david6666666 david6666666 linked an issue Jan 28, 2026 that may be closed by this pull request
@princepride (Collaborator, Author) commented:

@congw729 I would also like to add a Bagel e2e pytest in this PR. What do you think?

Signed-off-by: princepride <[email protected]>
Signed-off-by: princepride <[email protected]>
Signed-off-by: princepride <[email protected]>
@hsliuustc0106 hsliuustc0106 added the ready label (label to trigger buildkite CI) Jan 28, 2026
Signed-off-by: princepride <[email protected]>
Signed-off-by: princepride <[email protected]>
Signed-off-by: princepride <[email protected]>
@princepride (Collaborator, Author) commented:

@hsliuustc0106 Ready to merge.

@hsliuustc0106 (Collaborator) left a comment:

lgtm

@hsliuustc0106 hsliuustc0106 merged commit 741f7e2 into vllm-project:main Jan 28, 2026
7 checks passed
@gcanlin gcanlin mentioned this pull request Jan 30, 2026
dongbo910220 pushed a commit to dongbo910220/vllm-omni that referenced this pull request Feb 1, 2026

Labels

ready (label to trigger buildkite CI)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Bagel text_to_image error related to connector messaging

5 participants