[NPU][Refactor] Rename Diffusion* to Generation*#211

Merged
hsliuustc0106 merged 1 commit into vllm-project:main from gcanlin:rename
Dec 5, 2025
Conversation

@gcanlin (Contributor) commented Dec 5, 2025


Purpose

Follow-up to #163: rename the Diffusion* classes to Generation*.
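A class rename like this is often paired with a deprecated alias so that downstream code importing the old name keeps working for a release or two. A minimal sketch of that pattern (the class names below are hypothetical illustrations, not the actual vllm-omni classes touched by this PR):

```python
import warnings


class GenerationModel:
    """New name for the model class (hypothetical example)."""

    def run(self) -> str:
        return "ok"


class DiffusionModel(GenerationModel):
    """Deprecated alias kept for backward compatibility with the old name."""

    def __init__(self, *args, **kwargs):
        warnings.warn(
            "DiffusionModel is deprecated; use GenerationModel instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__(*args, **kwargs)


# Old call sites still work, but emit a DeprecationWarning.
legacy = DiffusionModel()
print(legacy.run())
```

Whether to keep such an alias depends on whether the renamed classes are part of the public API; a pure internal refactor can drop the old names outright, as this PR does.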

Test Plan

```shell
python examples/offline_inference/qwen2_5_omni/end2end.py --output-wav output_audio --query-type text
```

Test Result

```
[Stage-2] Max batch size: 1
--------------------------------
[Stage-2] Received batch size=1, request_ids=[0]
--------------------------------
(EngineCore_DP0 pid=854566) INFO:vllm_omni.model_executor.models.qwen2_5_omni.qwen2_5_omni:Currently, we do not use the chunked process, we only use the token2wav.process_chunk for the whole sequence. The stream mode will be implemented in the future.
Modular Diffusers is currently an experimental feature under active development. The API is subject to breaking changes in future releases.
INFO 12-05 07:31:43 [__init__.py:36] Available plugins for group vllm.platform_plugins:
INFO 12-05 07:31:43 [__init__.py:38] - ascend -> vllm_ascend:register
INFO 12-05 07:31:43 [__init__.py:41] All plugins in this group will be loaded. Set `VLLM_PLUGINS` to control which plugins to load.
INFO 12-05 07:31:43 [__init__.py:207] Platform plugin ascend is activated
INFO 12-05 07:31:43 [importing.py:63] Triton not installed or not compatible; certain GPU-related functions will not be available.
WARNING 12-05 07:31:44 [_custom_ops.py:20] Failed to import from vllm._C with ModuleNotFoundError("No module named 'vllm._C'")
...[Stage-2] Generate done: batch=1, req_ids=[0], gen_ms=131502.7
('Warning: torch.save with "_use_new_zipfile_serialization = False" is not recommended for npu tensor, which may bring unexpected errors and hopefully set "_use_new_zipfile_serialization = True"', 'if it is necessary to use this, please convert the npu tensor to cpu tensor for saving')
INFO:vllm_omni.entrypoints.omni_llm:[Summary] {'e2e_requests': 1, 'e2e_total_time_ms': 155054.5117855072, 'e2e_sum_time_ms': 155054.1956424713, 'e2e_total_tokens': 0, 'e2e_avg_time_per_request_ms': 155054.1956424713, 'e2e_avg_tokens_per_s': 0.0, 'wall_time_ms': 155054.5117855072, 'final_stage_id': 2, 'stages': [{'stage_id': 0, 'requests': 1, 'tokens': 49, 'total_time_ms': 1526.3910293579102, 'avg_time_per_request_ms': 1526.3910293579102, 'avg_tokens_per_s': 32.101865811287084}, {'stage_id': 1, 'requests': 1, 'tokens': 865, 'total_time_ms': 21992.52414703369, 'avg_time_per_request_ms': 21992.52414703369, 'avg_tokens_per_s': 39.331547130149204}, {'stage_id': 2, 'requests': 1, 'tokens': 0, 'total_time_ms': 131512.49384880066, 'avg_time_per_request_ms': 131512.49384880066, 'avg_tokens_per_s': 0.0}], 'transfers': [{'from_stage': 0, 'to_stage': 1, 'samples': 1, 'total_bytes': 1652666, 'total_time_ms': 2.309560775756836, 'tx_mbps': 5724.607093600908, 'rx_samples': 1, 'rx_total_bytes': 1652666, 'rx_total_time_ms': 4.258155822753906, 'rx_mbps': 3104.9422685169093, 'total_samples': 1, 'total_transfer_time_ms': 7.098674774169922, 'total_mbps': 1862.506512921072}, {'from_stage': 1, 'to_stage': 2, 'samples': 1, 'total_bytes': 2631, 'total_time_ms': 0.35381317138671875, 'tx_mbps': 59.48902330997304, 'rx_samples': 1, 'rx_total_bytes': 2631, 'rx_total_time_ms': 0.05698204040527344, 'rx_mbps': 369.3795422259414, 'total_samples': 1, 'total_transfer_time_ms': 1.2493133544921875, 'total_mbps': 16.847654693129773}]}
Request ID: 0, Text saved to output_audio/00000.txt
Request ID: 0, Saved audio to output_audio/output_0.wav
```



@hsliuustc0106 (Collaborator) left a comment:

lgtm

@hsliuustc0106 hsliuustc0106 enabled auto-merge (squash) December 5, 2025 08:18
@hsliuustc0106 hsliuustc0106 merged commit 6079b14 into vllm-project:main Dec 5, 2025
4 checks passed
LawJarp-A pushed a commit to LawJarp-A/vllm-omni that referenced this pull request Dec 12, 2025
faaany pushed a commit to faaany/vllm-omni that referenced this pull request Dec 19, 2025
princepride pushed a commit to princepride/vllm-omni that referenced this pull request Jan 10, 2026
