
[Feature] Omni Connector + ray supported #215

Merged
hsliuustc0106 merged 9 commits into vllm-project:main from natureofnature:wzliu_connector_dev
Dec 10, 2025

Conversation

@natureofnature (Contributor) commented Dec 5, 2025

  1. Added Omni Mooncake connectors to the distributed directory and integrated the Omni connector into vllm-omni
  2. Added Ray support, enabling Omni stage execution across nodes
  3. Moved shared-memory communication into the Omni connector

Purpose

  1. Create a unified connector (OmniConnector) for Multimodal Full Disaggregation (Encode/Prefill/Decode/Generator), see [RFC]: OmniConnector for Multimodal Full Disaggregation (Encode/Prefill/Decode/Generator) #62
  2. Support running in a distributed environment
  3. Create relatively standalone modules for communication and distributed execution

For more details, please refer to Design document
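As a rough illustration of what a per-stage connector config expresses, a sketch is shown below. The field names here are hypothetical and may differ from the actual schema of qwen2_5_omni_multiconnector.yaml:

```yaml
# Hypothetical stage-config sketch; not the real schema used by vLLM-Omni.
stages:
  - name: encode
    connector:
      type: omni_mooncake       # cross-node transfer of encoder outputs
  - name: prefill
    connector:
      type: omni_shared_memory  # intra-node handoff to the decode stage
  - name: decode
```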

Test Plan

  1. Qwen3-omni
  2. Hunyuan image

Test Result

The results below were produced with Qwen2.5-Omni, using the command

python openai_chat_completion_client_for_multimodal_generation.py --query-type text

Ray + omni connector

vllm serve /workspace/Qwen2.5-Omni-7B/ --omni --port 8091 --worker-backend ray --ray-address auto --stage-configs-path vllm-omni/vllm_omni/model_executor/stage_configs/qwen2_5_omni_multiconnector.yaml

Multiprocessing + omni connector

vllm serve /workspace/Qwen2.5-Omni-7B/ --omni --port 8091 --stage-configs-path vllm-omni/vllm_omni/model_executor/stage_configs/qwen2_5_omni_multiconnector.yaml 

Multiprocessing + shared memory connector

vllm serve /workspace/Qwen2.5-Omni-7B/ --omni --port 8091
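The shared-memory path exercised by the command above can be illustrated in general terms. This is a generic sketch using Python's stdlib `multiprocessing.shared_memory`; the function names and segment name are illustrative, not vLLM-Omni connector APIs:

```python
# Generic sketch of a shared-memory handoff between pipeline stages.
# producer_put / consumer_get and the segment name are hypothetical.
from multiprocessing import shared_memory

def producer_put(name: str, payload: bytes) -> shared_memory.SharedMemory:
    """Create a named segment and copy the payload into it (one copy)."""
    shm = shared_memory.SharedMemory(create=True, name=name, size=len(payload))
    shm.buf[: len(payload)] = payload
    return shm

def consumer_get(name: str, size: int) -> bytes:
    """Attach to the segment by name and copy the payload back out."""
    shm = shared_memory.SharedMemory(name=name)
    data = bytes(shm.buf[:size])  # segment may be page-aligned, so slice by size
    shm.close()
    return data

if __name__ == "__main__":
    msg = b"encoder output"
    seg = producer_put("omni_demo_seg", msg)
    print(consumer_get("omni_demo_seg", len(msg)) == msg)  # True
    seg.close()
    seg.unlink()  # producer owns the segment lifetime
```

The key property this models is that only the segment name travels between stage processes; the payload itself is never serialized through a pipe or socket.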


1. Added Omni Mooncake connectors to the distributed directory, integrated the Omni connector into vllm-omni, and rebased with main (3 times)
2. Ray + Mooncake connector verified working on 2 nodes
3. Added shared-memory communication to the Omni connector
@hsliuustc0106 (Collaborator) left a comment:

What if I have 2 instances for stage 0 and 3 instances for stage 1? How can I code such a scenario?

@Gaohan123 (Collaborator) left a comment:

Overall, the implementation makes sense. Please resolve the test problems and present more details at the weekly meeting. Thanks!

2. Fix default threshold
3. Fix the connector-skipping error and raise a connector error when connectors do not exist for edges
4. process to multi_process
5. Update design documents
6. Fix yaml
@natureofnature (Contributor, Author) replied:

> What if I have 2 instances for stage 0 and 3 instances for stage 1? How can I code such a scenario?

Currently this is not well supported; we may treat each instance as a single stage, because the orchestrator uses an asynchronous queue to schedule between stages. This feature can be added in P1.
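The asynchronous-queue scheduling described above can be sketched as follows. This is a minimal illustration with one worker per stage chained by `asyncio.Queue`s; the function names are hypothetical, not the orchestrator's real API:

```python
# Minimal sketch of stage scheduling through asyncio queues.
# stage_worker / run_pipeline are illustrative names only.
import asyncio

STOP = object()  # sentinel marking the end of the request stream

async def stage_worker(fn, inbox: asyncio.Queue, outbox: asyncio.Queue):
    """Drain the inbox, apply the stage function, forward to the outbox."""
    while True:
        item = await inbox.get()
        if item is STOP:
            await outbox.put(STOP)  # propagate shutdown downstream
            break
        await outbox.put(fn(item))

async def run_pipeline(requests, stage_fns):
    """Chain stages 0..n-1 with queues and run them concurrently."""
    queues = [asyncio.Queue() for _ in range(len(stage_fns) + 1)]
    workers = [
        asyncio.create_task(stage_worker(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(stage_fns)
    ]
    for r in requests:
        await queues[0].put(r)
    await queues[0].put(STOP)
    results = []
    while True:
        out = await queues[-1].get()
        if out is STOP:
            break
        results.append(out)
    await asyncio.gather(*workers)
    return results

if __name__ == "__main__":
    # Two "stages" (e.g. encode then decode) applied to three requests.
    out = asyncio.run(run_pipeline([1, 2, 3], [lambda x: x * 10, lambda x: x + 1]))
    print(out)  # [11, 21, 31]
```

Because each stage has a single worker reading a FIFO queue here, one instance effectively equals one stage; supporting multiple instances per stage would mean fanning out several workers over the same inbox queue.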

2. Fix yaml config for multi connector
@hsliuustc0106 (Collaborator) commented:

Any tests?
@hsliuustc0106 (Collaborator) commented:

I propose adding #203 (Bagel) as an example to test the throughput improvement.

@hsliuustc0106 (Collaborator) left a comment:

A few short comments left to be fixed.
@hsliuustc0106 hsliuustc0106 enabled auto-merge (squash) December 10, 2025 11:06
@hsliuustc0106 hsliuustc0106 self-requested a review December 10, 2025 12:13
@hsliuustc0106 (Collaborator) left a comment:

LGTM, thanks for such a great job.

@hsliuustc0106 hsliuustc0106 merged commit f995211 into vllm-project:main Dec 10, 2025
4 checks passed
LawJarp-A pushed a commit to LawJarp-A/vllm-omni that referenced this pull request Dec 12, 2025
e1ijah1 pushed a commit to e1ijah1/vllm-omni that referenced this pull request Dec 14, 2025
faaany pushed a commit to faaany/vllm-omni that referenced this pull request Dec 19, 2025
@amy-why-3459 mentioned this pull request Dec 27, 2025
princepride pushed a commit to princepride/vllm-omni that referenced this pull request Jan 10, 2026

Development

Successfully merging this pull request may close these issues:
  • [RFC]: OmniConnector for Multimodal Full Disaggregation (Encode/Prefill/Decode/Generator)
  • [New Model]: ByteDance-Seed/BAGEL-7B-MoT