
[Diffusion][Feature] CFG parallel support for Qwen-Image#444

Merged
hsliuustc0106 merged 15 commits into vllm-project:main from wtomin:cfg-fix
Jan 6, 2026

Conversation

@wtomin
Contributor

@wtomin wtomin commented Dec 24, 2025


Purpose

This PR adds CFG-parallel support for the Qwen-Image family of models. CFG parallelism runs the positive and negative prompts of classifier-free guidance (CFG) on different devices, then merges the two predictions on a single device to perform the scheduler step.
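For context, the merge step described above can be sketched as follows. This is a minimal illustration of the standard CFG combination, not the code from this PR; in CFG-parallel mode, one rank computes the conditional (positive-prompt) prediction and another the unconditional (negative-prompt) prediction, and the two tensors are gathered onto one device before this step runs.

```python
# Illustrative sketch of the classifier-free guidance (CFG) merge that
# happens after the positive and negative branches finish. In CFG-parallel
# mode the two branches run on different devices and their outputs are
# gathered onto a single device first. Plain lists stand in for tensors;
# this is NOT the PR's actual implementation.

def cfg_merge(cond_pred, uncond_pred, cfg_scale):
    """Standard CFG combination: uncond + scale * (cond - uncond)."""
    return [u + cfg_scale * (c - u) for c, u in zip(cond_pred, uncond_pred)]

# With cfg_scale = 1.0 the result equals the conditional prediction;
# larger scales push the output further toward the prompt.
merged = cfg_merge([1.0, 2.0], [0.5, 1.0], cfg_scale=4.0)
# merged == [2.5, 5.0]
```

The scheduler step then consumes the merged prediction on the single device, which is why the two branches must be synchronized once per denoising step.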

Test Plan

  • image generation

python examples/offline_inference/text_to_image/text_to_image.py --cfg_parallel_size 2

python examples/offline_inference/text_to_image/text_to_image.py --cfg_parallel_size 2 --cache_backend tea_cache

  • image edit

python examples/offline_inference/image_to_image/image_edit.py --model "Qwen/Qwen-Image-Edit" --image ./qwen_image_output.png --prompt "turn this coffee cup to a glass of wine" --output output_image_edit.png --num_inference_steps 50 --cfg_scale 4.0 --cfg_parallel_size 2

python examples/offline_inference/image_to_image/image_edit.py --model "Qwen/Qwen-Image-Edit" --image ./qwen_image_output.png --prompt "turn this coffee cup to a glass of wine" --output output_image_edit.png --num_inference_steps 50 --cfg_scale 4.0 --cache_backend tea_cache --cfg_parallel_size 2

python examples/offline_inference/image_to_image/image_edit.py --model "Qwen/Qwen-Image-Edit-2509" --image ./qwen_image_output.png --prompt "turn this coffee cup to a glass of wine" --output output_image_edit.png --num_inference_steps 50 --cfg_scale 4.0 --cfg_parallel_size 2

python examples/offline_inference/image_to_image/image_edit.py --model "Qwen/Qwen-Image-Layered" --image ./qwen_image_output.png --prompt "turn this coffee cup to a glass of wine" --num_inference_steps 50 --cfg_scale 4.0 --layers 2 --color-format "RGBA" --output "layered" --cfg_parallel_size 2

Test Result

| task | model | cfg_parallel_size | time | generated image |
| --- | --- | --- | --- | --- |
| T2I | Qwen/Qwen-Image | 1 | 20.5s | qwen_image_output |
| T2I | Qwen/Qwen-Image | 2 | 13.08s | qwen_image_output |
| I2I | Qwen/Qwen-Image-Edit | 1 | 54.3s | output_image_edit_0 |
| I2I | Qwen/Qwen-Image-Edit | 2 | 29.9s | output_image_edit_0 |
| I2I | Qwen/Qwen-Image-Edit-2509 | 1 | 45.0s | output_image_edit_0 |
| I2I | Qwen/Qwen-Image-Edit-2509 | 2 | 25.5s | output_image_edit_0 |
| I2I | Qwen/Qwen-Image-Layered | 1 | 32.3s | layered_0 layered_1 |
| I2I | Qwen/Qwen-Image-Layered | 2 | 19.3s | layered_0 layered_1 |
| task | cache backend | model | cfg_parallel_size | time | generated image |
| --- | --- | --- | --- | --- | --- |
| T2I | tea_cache | Qwen/Qwen-Image | 1 | 12.5s | qwen_image_output_teacache |
| T2I | tea_cache | Qwen/Qwen-Image | 2 | 10.6s | qwen_image_output_teacache_cfg |
| I2I | tea_cache | Qwen/Qwen-Image-Edit | 1 | 24.35s | output_image_edit |
| I2I | tea_cache | Qwen/Qwen-Image-Edit | 2 | 17.06s | output_image_edit |

Setting:

  • vllm: 0.12.0
  • pytorch: 2.9.0
  • python: 3.12
  • cuda: 12.8

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft.


@ZJY0516
Collaborator

ZJY0516 commented Dec 24, 2025

looking forward to it

@hsliuustc0106
Collaborator

Do we expect it to be merged before the 12/30 release? @wtomin

@wtomin
Contributor Author

wtomin commented Dec 25, 2025

> Do we expect it to be merged before the 12/30 release? @wtomin

I think so. I will get it done today.

@wtomin wtomin marked this pull request as ready for review December 26, 2025 08:25

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.


@hsliuustc0106
Collaborator

any progress or comment @gcanlin @ZJY0516 @wtomin @SamitHuang

Contributor

@gcanlin gcanlin left a comment


Would CFG parallel perform better than USP? If they are roughly the same, I'd prefer to move this to the next release to ensure the quality of 0.12.0.

@ZJY0516
Collaborator

ZJY0516 commented Dec 30, 2025

> Would CFG parallel perform better than USP? If they are roughly the same, I'd prefer to move this to the next release to ensure the quality of 0.12.0.

They are orthogonal: CFG parallelism splits the positive/negative guidance branches across devices, while USP parallelizes over the sequence dimension, so the two can be combined.

@wtomin wtomin force-pushed the cfg-fix branch 2 times, most recently from 9107039 to b96d386 on January 5, 2026 10:32
@wtomin
Contributor Author

wtomin commented Jan 5, 2026

@hsliuustc0106 The current branch is compatible with tea_cache.

@hsliuustc0106 hsliuustc0106 added the `ready` label (triggers buildkite CI) Jan 5, 2026
@hsliuustc0106
Collaborator

fix precommit please

@hsliuustc0106
Collaborator

@ZJY0516 PTAL final check

@gcanlin
Contributor

gcanlin commented Jan 6, 2026

Also works on NPU. Thanks!

python text_to_image.py --cfg_parallel_size 2

And it achieved the expected speedup.

| model | config | time |
| --- | --- | --- |
| Qwen-Image | --cfg_parallel_size 2 | 58s |
| Qwen-Image | --cfg_parallel_size 1 | 37s |

wtomin added 12 commits January 6, 2026 15:25
Signed-off-by: Didan Deng <[email protected]>
(same sign-off on all 12 commits)
wtomin added 3 commits January 6, 2026 15:27
Signed-off-by: Didan Deng <[email protected]>
(same sign-off on all 3 commits)
Collaborator

@hsliuustc0106 hsliuustc0106 left a comment


lgtm

@hsliuustc0106 hsliuustc0106 merged commit 14c04d0 into vllm-project:main Jan 6, 2026
7 checks passed
@david6666666
Collaborator

@wtomin Parameter passing is missing in the online serving scenario; please add it.

Shirley125 pushed a commit to Shirley125/vllm-omni that referenced this pull request Jan 9, 2026
princepride pushed a commit to princepride/vllm-omni that referenced this pull request Jan 10, 2026
sniper35 pushed a commit to sniper35/vllm-omni that referenced this pull request Jan 10, 2026
ZJY0516 pushed a commit to LawJarp-A/vllm-omni that referenced this pull request Jan 10, 2026
@wtomin wtomin deleted the cfg-fix branch February 2, 2026 07:24