[Bugfix] Fix generation artifacts of Qwen-Image-Edit-2511 and update pipeline DiT param parsing #776

Merged
ZJY0516 merged 9 commits into vllm-project:main from SamitHuang:fix_pipeline_dit
Jan 15, 2026

Conversation

@SamitHuang
Collaborator

@SamitHuang SamitHuang commented Jan 14, 2026


Purpose

Fixes #675

Test Plan

cd vllm-omni/examples/offline_inference/image_to_image
python image_edit.py \
    --model /home/yx/models/Qwen/Qwen-Image-Edit-2511 \
    --image "readme_cn.png" \
    --prompt "Make the girl in the image put her hands down." \
    --output output_image_edit.png \
    --num_inference_steps 50

Test Result

Before:

output_seed0

After:
output_image_edit new




@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8ac473c1eb


@SamitHuang SamitHuang changed the title [Bugfix] Fix Qwen-Image-Edit generation precision and update pipeline DiT param parsing [Bugfix] Fix generation precision in Qwen-Image-Edit-2512 and update pipeline DiT param parsing Jan 14, 2026
@SamitHuang SamitHuang requested a review from ZJY0516 January 14, 2026 04:46
@SamitHuang SamitHuang changed the title [Bugfix] Fix generation precision in Qwen-Image-Edit-2512 and update pipeline DiT param parsing [Bugfix] Fix generation artifacts of Qwen-Image-Edit-2512 and update pipeline DiT param parsing Jan 14, 2026
Signed-off-by: samithuang <[email protected]>
Signed-off-by: samithuang <[email protected]>
Signed-off-by: samithuang <[email protected]>
@SamitHuang SamitHuang changed the title [Bugfix] Fix generation artifacts of Qwen-Image-Edit-2512 and update pipeline DiT param parsing [Bugfix] Fix generation artifacts of Qwen-Image-Edit-2511 and update pipeline DiT param parsing Jan 14, 2026
@david6666666 david6666666 added this to the v0.14.0rc1 milestone Jan 14, 2026
@SamitHuang SamitHuang added the ready label to trigger buildkite CI label Jan 14, 2026
@jiangmengyu18
Contributor

jiangmengyu18 commented Jan 14, 2026

@SamitHuang
AdaLayerNorm actually supports modulate_index; it was just omitted during the earlier adaptation. You can fix this bug like:

self.img_norm1 = AdaLayerNorm(dim, elementwise_affine=False, eps=eps)
...

img_modulated, img_gate1 = self.img_norm1(hidden_states, img_mod1, modulate_index)
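For context, the per-stream selection this suggestion relies on can be sketched as follows. This is a hypothetical stand-in, not the actual vllm-omni AdaLayerNorm: it assumes `mod` packs (shift, scale, gate) along the last dimension and that `modulate_index` selects one row of a batched modulation tensor.

```python
import torch
import torch.nn as nn


class AdaLayerNormSketch(nn.Module):
    """Hypothetical sketch of an AdaLayerNorm that honors modulate_index."""

    def __init__(self, dim: int, eps: float = 1e-6, elementwise_affine: bool = False):
        super().__init__()
        self.norm = nn.LayerNorm(dim, eps=eps, elementwise_affine=elementwise_affine)

    def forward(self, x, mod, modulate_index=None):
        # mod packs (shift, scale, gate) along the last dim: (..., 3 * dim).
        shift, scale, gate = mod.chunk(3, dim=-1)
        if modulate_index is not None:
            # Pick the modulation row for this stream when mod carries
            # entries for several streams stacked along dim 0.
            shift, scale, gate = shift[modulate_index], scale[modulate_index], gate[modulate_index]
        # Standard adaptive layer norm: normalize, then scale and shift.
        x = self.norm(x) * (1 + scale) + shift
        return x, gate


m = AdaLayerNormSketch(8)
x = torch.randn(2, 4, 8)
mod = torch.randn(3, 1, 24)  # modulation for 3 streams
out, gate = m(x, mod, modulate_index=1)
```

Dropping `modulate_index` silently applies the wrong (or broadcast) modulation row, which matches the kind of subtle artifacts reported in #675.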

Contributor

@gcanlin gcanlin left a comment


Teacache should also change the corresponding parameter.

img_modulated, _ = block.img_norm1(hidden_states, img_mod1)

@ZJY0516 ZJY0516 merged commit 2d5faf3 into vllm-project:main Jan 15, 2026
7 checks passed
@SamitHuang
Collaborator Author

Teacache should also change the corresponding parameter.

img_modulated, _ = block.img_norm1(hidden_states, img_mod1)

@yuanheng-zhao Currently qwen-image-edit with tea-cache still has slight artifacts. I tried to add modulate_index in tea_cache as well, but got the following error. Can you PTAL?

[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/vllm_omni/diffusion/worker/gpu_worker.py", line 150, in execute_model
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     output = self.pipeline.forward(req)
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]              ^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/vllm_omni/diffusion/models/qwen_image/pipeline_qwen_image_edit_plus.py", line 809, in forward
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     latents = self.diffuse(
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]               ^^^^^^^^^^^^^
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/vllm_omni/diffusion/models/qwen_image/pipeline_qwen_image_edit_plus.py", line 599, in diffuse
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     noise_pred = self.transformer(
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]                  ^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     return self._call_impl(*args, **kwargs)
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     return forward_call(*args, **kwargs)
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/vllm_omni/diffusion/hooks.py", line 58, in __call__
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     return registry.dispatch(*args, **kwargs)
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/vllm_omni/diffusion/hooks.py", line 97, in dispatch
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     return hook.new_forward(self.module, *args, **kwargs)
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/vllm_omni/diffusion/cache/teacache/hook.py", line 161, in new_forward
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     outputs = ctx.run_transformer_blocks()
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/vllm_omni/diffusion/cache/teacache/extractors.py", line 234, in run_transformer_blocks
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     e, h = block(
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]            ^^^^^^
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/.venv/lib/python3.12/site-packages/torch/_subclasses/fake_tensor.py", line 1376, in __torch_dispatch__
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     return self.dispatch(func, types, args, kwargs)
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/.venv/lib/python3.12/site-packages/torch/_subclasses/fake_tensor.py", line 2096, in dispatch
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     return self._cached_dispatch_impl(func, types, args, kwargs)
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/.venv/lib/python3.12/site-packages/torch/_subclasses/fake_tensor.py", line 1511, in _cached_dispatch_impl
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     output = self._dispatch_impl(func, types, args, kwargs)
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/.venv/lib/python3.12/site-packages/torch/_subclasses/fake_tensor.py", line 2639, in _dispatch_impl
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     decomposition_table[func](*args, **kwargs)
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/.venv/lib/python3.12/site-packages/torch/_prims_common/wrappers.py", line 309, in _fn
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     result = fn(*args, **kwargs)
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]              ^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/.venv/lib/python3.12/site-packages/torch/_compile.py", line 53, in inner
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     return disable_fn(*args, **kwargs)
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/.venv/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 1044, in _fn
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     return fn(*args, **kwargs)
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]            ^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/.venv/lib/python3.12/site-packages/torch/_prims_common/wrappers.py", line 149, in _fn
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     result = fn(**bound.arguments)
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]              ^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/.venv/lib/python3.12/site-packages/torch/_refs/__init__.py", line 2920, in cat
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     return prims.cat(filtered, dim).clone(memory_format=memory_format)
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]            ^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/.venv/lib/python3.12/site-packages/torch/_ops.py", line 841, in __call__
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     return self._op(*args, **kwargs)
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]            ^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/.venv/lib/python3.12/site-packages/torch/utils/_stats.py", line 28, in wrapper
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     return fn(*args, **kwargs)
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]            ^^^^^^^^^^^^^^^^^^^

[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/.venv/lib/python3.12/site-packages/torch/_subclasses/fake_tensor.py", line 1376, in __torch_dispatch__
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     return self.dispatch(func, types, args, kwargs)
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/.venv/lib/python3.12/site-packages/torch/_subclasses/fake_tensor.py", line 2096, in dispatch
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     return self._cached_dispatch_impl(func, types, args, kwargs)
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/.venv/lib/python3.12/site-packages/torch/_subclasses/fake_tensor.py", line 1511, in _cached_dispatch_impl
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     output = self._dispatch_impl(func, types, args, kwargs)
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/.venv/lib/python3.12/site-packages/torch/_subclasses/fake_tensor.py", line 2661, in _dispatch_impl
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     func.prim_meta_impl(*args, **kwargs)
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/.venv/lib/python3.12/site-packages/torch/_prims/__init__.py", line 1796, in _cat_meta
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     torch._check(
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/.venv/lib/python3.12/site-packages/torch/__init__.py", line 1695, in _check
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     _check_with(RuntimeError, cond, message)
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]   File "/home/yx/vllm-omni/.venv/lib/python3.12/site-packages/torch/__init__.py", line 1677, in _check_with
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     raise error_type(message_evaluated)
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309] torch._dynamo.exc.TorchRuntimeError: Dynamo failed to run FX node with fake tensors: call_function <built-in method cat of type object at 0x7fffe99e1c40>(*([FakeTensor(..., device='cuda:0', size=(1, s31, 24, 128), dtype=torch.bfloat16), FakeTensor(..., device='cuda:0', size=(0, s87, 24, 128), dtype=torch.bfloat16)],), **{'dim': 1}): got RuntimeError('Sizes of tensors must match except in dimension 1. Expected 1 in dimension 0 but got 0 for tensor number 1 in the list')
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309] from user code:
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]    File "/home/yx/vllm-omni/vllm_omni/diffusion/models/qwen_image/qwen_image_transformer.py", line 423, in torch_dynamo_resume_in_forward_at_384
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]     joint_query = torch.cat([txt_query, img_query], dim=1)
[Stage-0] ERROR 01-14 17:23:24 [gpu_worker.py:309]

@ZJY0516
Collaborator

ZJY0516 commented Jan 15, 2026

torch._dynamo.exc.TorchRuntimeError: Dynamo failed to run FX node with fake tensors: call_function <built-in method cat of type object at 0x7fffe99e1c40>(*([FakeTensor(..., device='cuda:0', size=(1, s31, 24, 128), dtype=torch.bfloat16), FakeTensor(..., device='cuda:0', size=(0, s87, 24, 128), dtype=torch.bfloat16)],), **{'dim': 1}): got RuntimeError('Sizes of tensors must match except in dimension 1. Expected 1 in dimension 0 but got 0 for tensor number 1 in the list')

Perhaps it's related to torch.compile.
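The mismatch in the trace (batch size 1 vs. 0 in dimension 0) is actually reproducible in eager PyTorch, independent of torch.compile. A minimal sketch, assuming shapes like those in the FakeTensor report; the guard at the end is one hypothetical workaround, not the fix adopted in this PR:

```python
import torch

# Shapes mirror the FakeTensor report: txt_query has batch 1, img_query batch 0.
txt_query = torch.randn(1, 5, 24, 128)
img_query = torch.randn(0, 7, 24, 128)

# torch.cat requires all sizes to match except along the cat dimension,
# so the 1-vs-0 mismatch in dim 0 raises even in eager mode.
try:
    torch.cat([txt_query, img_query], dim=1)
    cat_error = ""
except RuntimeError as err:
    cat_error = str(err)

# Hypothetical guard: drop empty batch slices before concatenating.
parts = [q for q in (txt_query, img_query) if q.shape[0] > 0]
joint_query = parts[0] if len(parts) == 1 else torch.cat(parts, dim=1)
```

This suggests the zero-sized `img_query` (likely produced by the teacache extractor slicing an empty stream) is the root cause, and torch.compile merely surfaces it through the fake-tensor meta check.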

GG-li pushed a commit to GG-li/vllm-omni that referenced this pull request Jan 15, 2026
@yuanheng-zhao
Contributor

yuanheng-zhao commented Jan 16, 2026

Teacache should also change the corresponding parameter.

img_modulated, _ = block.img_norm1(hidden_states, img_mod1)

@yuanheng-zhao Currently qwen-image-edit with tea-cache still has slight artifacts. I tried to add modulate_index in tea_cache as well, but got the following error. Can you PTAL?

[full stack trace quoted from the comment above omitted]

Hey @SamitHuang, may I have the command you ran that produced the torch.compile exception? I added modulate_index in extract_qwen_context and tested both image edit (Qwen-Image-Edit) and multi-image edit (Qwen-Image-Edit-2509) with teacache enabled, but didn't reproduce the error.

erfgss pushed a commit to erfgss/vllm-omni that referenced this pull request Jan 19, 2026
…pipeline DiT param parsing (vllm-project#776)

Signed-off-by: samithuang <[email protected]>
Signed-off-by: Chen Yang <[email protected]>
with1015 pushed a commit to with1015/vllm-omni that referenced this pull request Jan 20, 2026

Labels

ready label to trigger buildkite CI


Development

Successfully merging this pull request may close these issues.

[Bug]: Qwen-Image-Edit-2511 Inference Results Are Abnormal on Ascend NPU

7 participants