
[Bugfix] fix qwen image layerd in dummy run#1027

Merged
hsliuustc0106 merged 2 commits intovllm-project:mainfrom
ZJY0516:fix-qwen-image-layerd
Jan 29, 2026
Conversation

@ZJY0516 (Collaborator) commented Jan 28, 2026

Purpose

FIX #1002

add color_format for dummy run

cc @wtomin @fhfuih @Gaohan123
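
A minimal sketch of what "add color_format for dummy run" means: the warm-up image is built with the channel count the pipeline declares (4 for RGBA) instead of a hard-coded RGB. This is an illustration only; the helper name `make_dummy_image` is hypothetical and not taken from the vllm-omni source.

```python
# Hypothetical illustration (not the actual vllm-omni code): the dummy
# warm-up image honors the pipeline's declared color format instead of
# always being RGB, which is what broke Qwen-Image-Layered (issue #1002).
def make_dummy_image(width: int, height: int, color_format: str = "RGB"):
    channels = len(color_format)  # "RGB" -> 3 channels, "RGBA" -> 4
    # A zero-filled height x width grid of per-pixel channel tuples.
    return [[(0,) * channels for _ in range(width)] for _ in range(height)]

rgba = make_dummy_image(8, 8, color_format="RGBA")
print(len(rgba), len(rgba[0]), len(rgba[0][0]))  # 8 8 4
```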

Test

python image_edit.py --model "Qwen/Qwen-Image-Layered" --image ./qwen_image_output.png --prompt "turn this coffee cup to a glass of wine" --output output_image_edit.png --num_inference_steps 50 --cfg_scale 4.0 --layers 2 --color-format "RGBA" --output "layered"
WARNING 01-28 18:51:55 [mooncake_connector.py:18] Mooncake not available, MooncakeOmniConnector will not work
WARNING 01-28 18:51:55 [yuanrong_connector.py:14] Datasystem not available, YuanrongConnector will not work
INFO 01-28 18:51:55 [omni.py:119] Initializing stages for model: Qwen/Qwen-Image-Layered
INFO 01-28 18:51:56 [initialization.py:35] No OmniTransferConfig provided
INFO 01-28 18:51:56 [omni_stage.py:84] [OmniStage] stage_config: {'stage_id': 0, 'stage_type': 'diffusion', 'runtime': {'process': True, 'devices': '0', 'max_batch_size': 1}, 'engine_args': {'vae_use_slicing': False, 'vae_use_tiling': False, 'cache_backend': None, 'cache_config': None, 'parallel_config': {'pipeline_parallel_size': 1, 'data_parallel_size': 1, 'tensor_parallel_size': 1, 'sequence_parallel_size': 1, 'ulysses_degree': 1, 'ring_degree': 1, 'cfg_parallel_size': 1}, 'enforce_eager': False, 'enable_cpu_offload': False, 'model': 'Qwen/Qwen-Image-Layered', 'model_stage': 'diffusion'}, 'final_output': True, 'final_output_type': 'image'}
INFO 01-28 18:51:56 [omni.py:338] [Orchestrator] Waiting for 1 stages to initialize (timeout: 300s)
[Stage-0] WARNING 01-28 18:52:02 [mooncake_connector.py:18] Mooncake not available, MooncakeOmniConnector will not work
[Stage-0] WARNING 01-28 18:52:02 [yuanrong_connector.py:14] Datasystem not available, YuanrongConnector will not work
[Stage-0] INFO 01-28 18:52:02 [omni_stage.py:481] Starting stage worker with model: Qwen/Qwen-Image-Layered
[Stage-0] INFO 01-28 18:52:02 [omni_stage.py:491] [Stage] Set VLLM_WORKER_MULTIPROC_METHOD=spawn
[Stage-0] INFO 01-28 18:52:03 [weight_utils.py:46] Using model weights format ['*']
[Stage-0] INFO 01-28 18:52:04 [weight_utils.py:67] Time spent downloading weights for Qwen/Qwen-Image-Layered: 0.567695 seconds
[Stage-0] INFO 01-28 18:52:04 [multiproc_executor.py:74] Starting server...
[Stage-0] WARNING 01-28 18:52:10 [mooncake_connector.py:18] Mooncake not available, MooncakeOmniConnector will not work
[Stage-0] WARNING 01-28 18:52:10 [yuanrong_connector.py:14] Datasystem not available, YuanrongConnector will not work
[Stage-0] INFO 01-28 18:52:11 [gpu_diffusion_worker.py:266] Worker 0 created result MessageQueue
[Stage-0] INFO 01-28 18:52:11 [scheduler.py:229] Chunked prefill is enabled with max_num_batched_tokens=2048.
[Stage-0] INFO 01-28 18:52:11 [vllm.py:630] Asynchronous scheduling is enabled.
[Stage-0] INFO 01-28 18:52:11 [vllm.py:637] Disabling NCCL for DP synchronization when using async scheduling.
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[Stage-0] INFO 01-28 18:52:11 [gpu_diffusion_worker.py:94] Worker 0: Initialized device and distributed environment.
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[Stage-0] WARNING 01-28 18:52:11 [gpu_diffusion_model_runner.py:152] No OmniConnector config found, skipping initialization
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:02<00:00,  1.36it/s]
The image processor of type `Qwen2VLImageProcessor` is now loaded as a fast processor by default, even if the model checkpoint was saved with a slow processor. This is a breaking change and may produce slightly different outputs. To continue using the slow processor, instantiate this class with `use_fast=False`. Note that this behavior will be extended to all models in a future release.
[Stage-0] INFO 01-28 18:52:28 [selector.py:114] Using attention backend 'FLASH_ATTN' for diffusion
Loading safetensors checkpoint shards:   0% Completed | 0/5 [00:00<?, ?it/s]
Loading safetensors checkpoint shards:  20% Completed | 1/5 [00:00<00:00,  7.33it/s]
Loading safetensors checkpoint shards:  40% Completed | 2/5 [00:01<00:03,  1.01s/it]
Loading safetensors checkpoint shards:  60% Completed | 3/5 [00:03<00:02,  1.47s/it]
Loading safetensors checkpoint shards:  80% Completed | 4/5 [00:05<00:01,  1.71s/it]
Loading safetensors checkpoint shards: 100% Completed | 5/5 [00:08<00:00,  1.88s/it]
Loading safetensors checkpoint shards: 100% Completed | 5/5 [00:08<00:00,  1.61s/it]

[Stage-0] INFO 01-28 18:52:38 [diffusers_loader.py:227] Loading weights took 8.49 seconds
[Stage-0] INFO 01-28 18:52:39 [gpu_diffusion_model_runner.py:101] Model loading took 53.7462 GiB and 27.483816 seconds
[Stage-0] INFO 01-28 18:52:39 [gpu_diffusion_model_runner.py:106] Model runner: Model loaded successfully.
[Stage-0] INFO 01-28 18:52:39 [gpu_diffusion_model_runner.py:128] Model runner: Model compiled with torch.compile.
[Stage-0] INFO 01-28 18:52:39 [gpu_diffusion_model_runner.py:138] Model runner: Initialization complete.
[Stage-0] INFO 01-28 18:52:39 [manager.py:90] Initializing DiffusionLoRAManager: device=cuda:0, dtype=torch.bfloat16, max_cached_adapters=1, static_lora_path=None
[Stage-0] INFO 01-28 18:52:39 [gpu_diffusion_worker.py:125] Worker 0: Initialization complete.
[Stage-0] INFO 01-28 18:52:39 [gpu_diffusion_worker.py:387] Worker 0: Scheduler loop started.
[Stage-0] INFO 01-28 18:52:39 [gpu_diffusion_worker.py:317] Worker 0 ready to receive requests via shared memory
[Stage-0] INFO 01-28 18:52:39 [scheduler.py:39] SyncScheduler initialized result MessageQueue
[Stage-0] INFO 01-28 18:52:39 [diffusion_engine.py:337] dummy run to warm up the model
[Stage-0] INFO 01-28 18:52:39 [manager.py:538] Deactivating all adapters: 0 layers
[Stage-0] WARNING 01-28 18:52:39 [pipeline_qwen_image_layered.py:833] true_cfg_scale is passed as 4.0, but classifier-free guidance is not enabled since no negative_prompt is provided.
[Stage-0] INFO 01-28 18:52:45 [omni_stage.py:709] Max batch size: 1
INFO 01-28 18:52:45 [omni.py:331] [Orchestrator] Stage-0 reported ready
INFO 01-28 18:52:45 [omni.py:357] [Orchestrator] All stages initialized successfully
Pipeline loaded

============================================================
Generation Configuration:
  Model: Qwen/Qwen-Image-Layered
  Inference steps: 50
  Cache backend: None (no acceleration)
  Input image size: (1024, 1024)
  Parallel configuration: ulysses_degree=1, ring_degree=1, cfg_parallel_size=1, tensor_parallel_size=1
============================================================

Adding requests:   0%|                                                                                                                         | 0/1 [00:00<?, ?it/s]
[Stage-0] INFO 01-28 18:52:45 [diffusion_engine.py:75] Pre-processing completed in 0.0291 seconds
[Stage-0] INFO 01-28 18:52:45 [manager.py:538] Deactivating all adapters: 0 layers
[Stage-0] WARNING 01-28 18:52:45 [pipeline_qwen_image_layered.py:833] true_cfg_scale is passed as 4.0, but classifier-free guidance is not enabled since no negative_prompt is provided.
[Stage-0] WARNING 01-28 18:52:45 [pipeline_qwen_image_layered.py:905] guidance_scale is passed as 1.0, but ignored since the model is not guidance-distilled.
[Stage-0] INFO 01-28 18:53:30 [diffusion_engine.py:80] Generation completed successfully.
[Stage-0] INFO 01-28 18:53:30 [diffusion_engine.py:98] Post-processing completed in 0.0000 seconds
INFO 01-28 18:53:30 [log_utils.py:550] {'type': 'request_level_metrics',
INFO 01-28 18:53:30 [log_utils.py:550]  'request_id': '0_77b1a21f-c089-41c4-b8d4-230bac0c6f23',
INFO 01-28 18:53:30 [log_utils.py:550]  'e2e_time_ms': 45834.9244594574,
INFO 01-28 18:53:30 [log_utils.py:550]  'e2e_tpt': 0.0,
INFO 01-28 18:53:30 [log_utils.py:550]  'e2e_total_tokens': 0,
INFO 01-28 18:53:30 [log_utils.py:550]  'transfers_total_time_ms': 0.0,
INFO 01-28 18:53:30 [log_utils.py:550]  'transfers_total_bytes': 0,
INFO 01-28 18:53:30 [log_utils.py:550]  'stages': {0: {'stage_gen_time_ms': 45733.11948776245,
INFO 01-28 18:53:30 [log_utils.py:550]                 'num_tokens_out': 0,
INFO 01-28 18:53:30 [log_utils.py:550]                 'num_tokens_in': 0}}}
Processed prompts: 100%|██████████████████████████████████████████████████████████| 1/1 [00:45<00:00, 45.83s/img, est. speed stage-0 img/s: 0.00, avg e2e_lat: 0.0ms]
INFO 01-28 18:53:30 [omni.py:860] [Summary] {'e2e_requests': 1,
INFO 01-28 18:53:30 [omni.py:860]  'e2e_total_time_ms': 45836.013317108154,
INFO 01-28 18:53:30 [omni.py:860]  'e2e_sum_time_ms': 45834.9244594574,
INFO 01-28 18:53:30 [omni.py:860]  'e2e_total_tokens': 0,
INFO 01-28 18:53:30 [omni.py:860]  'e2e_avg_time_per_request_ms': 45834.9244594574,
INFO 01-28 18:53:30 [omni.py:860]  'e2e_avg_tokens_per_s': 0.0,
INFO 01-28 18:53:30 [omni.py:860]  'wall_time_ms': 45836.013317108154,
INFO 01-28 18:53:30 [omni.py:860]  'final_stage_id': {'0_77b1a21f-c089-41c4-b8d4-230bac0c6f23': 0},
INFO 01-28 18:53:30 [omni.py:860]  'stages': [{'stage_id': 0,
INFO 01-28 18:53:30 [omni.py:860]              'requests': 1,
INFO 01-28 18:53:30 [omni.py:860]              'tokens': 0,
INFO 01-28 18:53:30 [omni.py:860]              'total_time_ms': 45835.47401428223,
INFO 01-28 18:53:30 [omni.py:860]              'avg_time_per_request_ms': 45835.47401428223,
INFO 01-28 18:53:30 [omni.py:860]              'avg_tokens_per_s': 0.0}],
INFO 01-28 18:53:30 [omni.py:860]  'transfers': []}
Adding requests:   0%|                                                                                                                         | 0/1 [00:45<?, ?it/s]
Total generation time: 45.8371 seconds (45837.14 ms)
Saved edited image to /home/zjy/code/vllm-omni/examples/offline_inference/image_to_image/layered_0.png
Saved edited image to /home/zjy/code/vllm-omni/examples/offline_inference/image_to_image/layered_1.png
[Stage-0] INFO 01-28 18:53:31 [omni_stage.py:757] Received shutdown signal
[Stage-0] INFO 01-28 18:53:31 [gpu_diffusion_worker.py:346] Worker 0: Received shutdown message
[Stage-0] INFO 01-28 18:53:31 [gpu_diffusion_worker.py:367] event loop terminated.
[Stage-0] INFO 01-28 18:53:31 [gpu_diffusion_worker.py:395] Worker 0: Shutdown complete.

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft.


Signed-off-by: zjy0516 <[email protected]>
Copilot AI (Contributor) left a comment

Pull request overview

This PR fixes issue #1002 where the Qwen/Qwen-Image-Layered model throws an error during the dummy run phase. The error occurred because the model requires RGBA format images, but the dummy run was creating RGB images. The fix introduces a color_format attribute system that allows models to specify their required image color format.

Changes:

  • Added color_format class variable to the SupportImageInput protocol with a default value of "RGB"
  • Set color_format = "RGBA" specifically for QwenImageLayeredPipeline
  • Implemented image_color_format() function to retrieve the color format for a model and updated the dummy run to use it
  • Cleaned up unused logger import and logging statement in the example script
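
The mechanism in the changes above can be sketched as follows. This is an illustrative reconstruction based only on the PR summary, not the actual vllm-omni source: the names `SupportImageInput`, `color_format`, and `image_color_format()` mirror the description, but the real signatures and lookup logic may differ.

```python
# Illustrative sketch of the color_format mechanism described in this PR.
# Based only on the PR summary; the real vllm-omni code may differ.
from typing import Protocol


class SupportImageInput(Protocol):
    # Default color format; pipelines that need something else override it.
    color_format: str = "RGB"


class QwenImageLayeredPipeline:
    # Qwen-Image-Layered works with images that carry an alpha channel.
    color_format = "RGBA"


def image_color_format(model: object) -> str:
    # Look up the model's declared color format, falling back to "RGB".
    return getattr(model, "color_format", "RGB")


print(image_color_format(QwenImageLayeredPipeline()))  # RGBA
print(image_color_format(object()))                    # RGB
```

The dummy run can then create its warm-up image in whatever format the pipeline declares, rather than assuming RGB.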

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File descriptions:
  • vllm_omni/diffusion/models/interface.py: added default color_format class variable to the SupportImageInput protocol
  • vllm_omni/diffusion/models/qwen_image/pipeline_qwen_image_layered.py: set color_format = "RGBA" for the layered pipeline
  • vllm_omni/diffusion/diffusion_engine.py: added image_color_format() function and updated the dummy run to use the appropriate color format
  • examples/offline_inference/image_to_image/image_edit.py: removed unused logger import and logging statement


@fhfuih (Contributor) left a comment

I think the current fix is good for now. Thanks

@SamitHuang SamitHuang added the ready label to trigger buildkite CI label Jan 29, 2026
@wtomin (Contributor) commented Jan 29, 2026

LGTM.

@hsliuustc0106 hsliuustc0106 merged commit 43c6f52 into vllm-project:main Jan 29, 2026
13 checks passed
dongbo910220 pushed a commit to dongbo910220/vllm-omni that referenced this pull request Feb 1, 2026

Labels

ready label to trigger buildkite CI


Development

Successfully merging this pull request may close these issues.

[Bug]: Qwen/Qwen-Image-Layered throws an error

5 participants