
[New model] Support model qwen image layered#381

Merged
hsliuustc0106 merged 1 commit into vllm-project:main from Bounty-hunter:support_qwen_image_layered
Dec 20, 2025

Conversation

Contributor

@Bounty-hunter Bounty-hunter commented Dec 19, 2025


Purpose

Support the Qwen-Image-Layered model.

Run with image_edit.py; the parameters most relevant to this model are --color-format "RGBA" and --layers x.

Test Plan

(1) run with vllm_omni

python image_edit.py --model "Qwen/Qwen-Image-Layered"   --image xxx --output "image_1_layered_" --num_inference_step 50 --cfg_scale 4.0 --layers x --prompt "" --color-format "RGBA"  --seed 777

(2) run with diffusers

from diffusers import QwenImageLayeredPipeline
import torch
from PIL import Image

pipeline = QwenImageLayeredPipeline.from_pretrained("Qwen/Qwen-Image-Layered")
pipeline = pipeline.to("cuda", torch.bfloat16)
pipeline.set_progress_bar_config(disable=None)

image = Image.open("/home/d00806799/code/vllm-omni/examples/offline_inference/image_to_image/vllm.jpg").convert("RGBA")
inputs = {
    "image": image,
    "generator": torch.Generator(device='cuda').manual_seed(777),
    "true_cfg_scale": 4.0,
    "negative_prompt": " ",
    "num_inference_steps": 50,
    "num_images_per_prompt": 1,
    "layers": 3,
    "resolution": 640,      # Using different bucket (640, 1024) to determine the resolution. For this version, 640 is recommended
    "cfg_normalize": False,  # Whether enable cfg normalization.
    "use_en_prompt": False,  # Automatic caption language if user does not provide caption
}

with torch.inference_mode():
    output = pipeline(**inputs)
    output_image = output.images[0]

for i, image in enumerate(output_image):
    image.save(f"{i}.png")
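Each saved file above is one RGBA layer. To check that the layers recombine into the full picture, they can be alpha-composited bottom-up with the Porter-Duff "over" operator (on real images, PIL's Image.alpha_composite applies the same rule per pixel). The single-pixel helper below is a hypothetical illustration of that operator, not code from this PR:

```python
# Hypothetical sketch of Porter-Duff "over" compositing for one RGBA pixel
# (0-255 integer channels), the operation used to merge layered outputs
# with the bottom layer first.
def over(bg, fg):
    """Composite foreground pixel fg over background pixel bg."""
    fa = fg[3] / 255.0
    ba = bg[3] / 255.0
    oa = fa + ba * (1.0 - fa)  # resulting alpha
    if oa == 0.0:
        return (0, 0, 0, 0)  # fully transparent result
    rgb = tuple(
        round((fg[i] * fa + bg[i] * ba * (1.0 - fa)) / oa) for i in range(3)
    )
    return rgb + (round(oa * 255),)

# Opaque red base with a half-transparent blue layer on top:
print(over((255, 0, 0, 255), (0, 0, 255, 128)))  # → (127, 0, 128, 255)
```

Folding all returned layers left-to-right with this operator (or with Image.alpha_composite on whole images) should reproduce the unlayered edit result.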

(3) Running image edit also succeeds

INFO 12-20 14:16:47 [omni_diffusion.py:86] Prepared 1 requests for generation.
INFO 12-20 14:16:47 [diffusion_engine.py:43] Pre-processing completed in 0.0680 seconds
INFO 12-20 14:17:47 [shm_broadcast.py:501] No available shared memory broadcast block found in 60 seconds. This typically happens when some processes are hanging or doing some time-consuming work (e.g. compilation, weight/kv cache quantization).
INFO 12-20 14:18:43 [diffusion_engine.py:48] Generation completed successfully.
INFO 12-20 14:18:43 [diffusion_engine.py:53] Post-processing completed in 0.0667 seconds
Total generation time: 116.1441 seconds (116144.12 ms)
Saved edited image to /home/d00806799/code/vllm-omni/examples/offline_inference/image_to_image/output_image_edit_0.png
INFO 12-20 14:18:43 [gpu_worker.py:198] Worker 0: Received shutdown message
INFO 12-20 14:18:43 [gpu_worker.py:222] event loop terminated.
INFO 12-20 14:18:43 [gpu_worker.py:253] Worker 0: Shutdown complete.

Test Result

(1) vllm-omni 2-layers
image

(2) diffusers 2-layers
image

(3) vllm-omni 3-layers
image

(4) diffusers 3-layers
image


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.


Comment on lines 73 to 77
assert requests.resolution in [640, 1024], (
    f"resolution must be either 640 or 1024, but got {requests.resolution}"
)
calculated_width, calculated_height = calculate_dimensions(
    requests.resolution * requests.resolution, image_size[0] / image_size[1]

P1 Badge Use request resolution instead of list attribute

Inside the pre-processing loop the resolution is read from requests rather than the individual req, so the function raises an AttributeError before any request is processed because the list object has no resolution attribute. This prevents the new layered pipeline from calculating dimensions or running at all.
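A sketch of the fix Codex suggests: read the resolution from each `req` inside the loop, not from the `requests` list (which has no `resolution` attribute). `Request` and `calculate_dimensions` below are simplified stand-ins for the PR's real types, included only so the sketch runs:

```python
# Hedged sketch: per-request resolution validation. The real PR's
# calculate_dimensions may differ; this stand-in just picks a width/height
# covering roughly target_area pixels at the given aspect ratio.
import math

def calculate_dimensions(target_area, aspect_ratio):
    width = math.sqrt(target_area * aspect_ratio)
    return round(width), round(width / aspect_ratio)

class Request:
    def __init__(self, resolution):
        self.resolution = resolution

def preprocess(requests, image_size):
    dims = []
    for req in requests:
        # Use req, not the enclosing requests list, inside the loop.
        assert req.resolution in (640, 1024), (
            f"resolution must be either 640 or 1024, but got {req.resolution}"
        )
        dims.append(calculate_dimensions(
            req.resolution * req.resolution, image_size[0] / image_size[1]
        ))
    return dims
```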


Comment on lines 733 to +736
 temb = (
-    self.time_text_embed(timestep, hidden_states)
+    self.time_text_embed(timestep, hidden_states, additional_t_cond)
     if guidance is None
-    else self.time_text_embed(timestep, guidance, hidden_states)
+    else self.time_text_embed(timestep, guidance, hidden_states, additional_t_cond)


P1 Badge Guidance branch calls time embedding with wrong signature

The guidance code path calls self.time_text_embed(timestep, guidance, hidden_states, additional_t_cond), but QwenTimestepProjEmbeddings.forward only accepts (timestep, hidden_states, addition_t_cond=None). When guidance is enabled (e.g., for guidance-distilled models), this path raises a TypeError before any diffusion steps run.
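A minimal reproduction of the mismatch the review describes, using a stand-in class with the same keyword signature (the class body here is illustrative only, not the real embedding code):

```python
# Hedged sketch: QwenTimestepProjEmbeddings.forward accepts
# (timestep, hidden_states, addition_t_cond=None), so calling it with four
# positional arguments, as the guidance branch in the diff does, raises
# TypeError before any diffusion step runs.
class QwenTimestepProjEmbeddings:
    def __call__(self, timestep, hidden_states, addition_t_cond=None):
        # The real forward projects the timestep (plus optional extra
        # conditioning) into an embedding; a placeholder return suffices.
        return "temb"

time_text_embed = QwenTimestepProjEmbeddings()

# Non-guidance branch matches the signature:
assert time_text_embed(0, "h", "cond") == "temb"

# Guidance branch passes four positional arguments and fails:
try:
    time_text_embed(0, "g", "h", "cond")
    raised = False
except TypeError:
    raised = True
assert raised
```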


@hsliuustc0106
Collaborator

please add a CI test

Collaborator

@ZJY0516 ZJY0516 left a comment

please also test that this does not break qwen-image and qwen-image-edit

@Bounty-hunter Bounty-hunter force-pushed the support_qwen_image_layered branch from 59173fe to 607147c on December 20, 2025 at 13:02
@@ -0,0 +1,1054 @@
# Copyright 2025 The Qwen-Image Team, Wan Team and The HuggingFace Team. All rights reserved.
Collaborator

We can import this from diffusers directly

Contributor Author

The image-layered changes in autoencoder_kl_qwenimage.py are only on the diffusers main branch; the current diffusers release is 0.36.

Collaborator

User should install latest diffusers from source as mentioned in https://huggingface.co/Qwen/Qwen-Image-Layered#quick-start
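One hedged way to make that requirement actionable is to probe for the pipeline class instead of pinning a version number, so users on a release build (0.36) get a clear error. `require_attr` is a hypothetical helper, demonstrated with a stdlib module so the sketch runs anywhere:

```python
# Hedged sketch: fail early with an actionable message if an installed
# package predates a needed symbol (e.g. diffusers.QwenImageLayeredPipeline,
# which at review time exists only on diffusers main).
import importlib

def require_attr(module_name, attr, hint=""):
    """Import module_name and return the named attribute, raising a clear
    ImportError (with an install hint) if the attribute is missing."""
    module = importlib.import_module(module_name)
    obj = getattr(module, attr, None)
    if obj is None:
        raise ImportError(f"{module_name} has no {attr!r}. {hint}".strip())
    return obj

# For this PR the call would look like:
#   require_attr("diffusers", "QwenImageLayeredPipeline",
#                hint="Install diffusers from source; see the model card.")
# Demonstrated with the stdlib so the sketch is runnable as-is:
sqrt = require_attr("math", "sqrt")
```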

Collaborator

remove this file please

Collaborator

shall we leave this for a later PR to fix?

@Bounty-hunter Bounty-hunter force-pushed the support_qwen_image_layered branch from 607147c to a2d8c99 on December 20, 2025 at 13:50
@hsliuustc0106
Collaborator

add the test result and provide the run example command

@hsliuustc0106 hsliuustc0106 added the ready label (to trigger buildkite CI) on Dec 20, 2025
@Bounty-hunter Bounty-hunter changed the title from "[WIP] Support model qwen image layered" to "[FEATURE] Support model qwen image layered" on Dec 20, 2025
@Bounty-hunter
Contributor Author

add the test result and provide the run example command
done

@hsliuustc0106 hsliuustc0106 changed the title from "[FEATURE] Support model qwen image layered" to "[New model] Support model qwen image layered" on Dec 20, 2025
Collaborator

@hsliuustc0106 hsliuustc0106 left a comment

lgtm

@hsliuustc0106 hsliuustc0106 enabled auto-merge (squash) December 20, 2025 14:50
@hsliuustc0106 hsliuustc0106 merged commit 85bc8bf into vllm-project:main Dec 20, 2025
6 checks passed
wtomin pushed a commit to wtomin/vllm-omni that referenced this pull request Dec 22, 2025
yenuo26 pushed a commit to yenuo26/vllm-omni that referenced this pull request Dec 29, 2025
@Bounty-hunter Bounty-hunter mentioned this pull request Dec 30, 2025
princepride pushed a commit to princepride/vllm-omni that referenced this pull request Jan 10, 2026
@david6666666 david6666666 mentioned this pull request Jan 16, 2026

Labels

ready label to trigger buildkite CI
