Adding requests: 0%| | 0/1 [00:13<?, ?it/s]
[Stage-0] ERROR 01-03 17:00:34 [omni_stage.py:636] Received shutdown signal
[Stage-0] INFO 01-03 17:00:34 [gpu_worker.py:265] Worker 0: Received shutdown message
[Stage-0] INFO 01-03 17:00:34 [gpu_worker.py:287] event loop terminated.
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] Error executing RPC: Tensors must be contiguous
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] [ERROR] 2026-01-03-17:00:34 (PID:1242551, Device:1, RankID:1) ERR02002 DIST invalid type
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] Traceback (most recent call last):
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/root/vllm-workspace/vllm-omni/vllm_omni/diffusion/worker/gpu_worker.py", line 221, in execute_rpc
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] result = func(*args, **kwargs)
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/root/vllm-workspace/vllm-omni/vllm_omni/diffusion/worker/gpu_worker.py", line 120, in generate
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] return self.execute_model(requests, self.od_config)
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] return func(*args, **kwargs)
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/root/vllm-workspace/vllm-omni/vllm_omni/diffusion/worker/gpu_worker.py", line 140, in execute_model
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] output = self.pipeline.forward(req)
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/root/vllm-workspace/vllm-omni/vllm_omni/diffusion/models/qwen_image/pipeline_qwen_image.py", line 717, in forward
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] latents = self.diffuse(
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^^^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/root/vllm-workspace/vllm-omni/vllm_omni/diffusion/models/qwen_image/pipeline_qwen_image.py", line 556, in diffuse
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] noise_pred = self.transformer(
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] return self._call_impl(*args, **kwargs)
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] return forward_call(*args, **kwargs)
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/usr/local/python3.11.13/lib/python3.11/site-packages/cache_dit/caching/cache_adapters/cache_adapter.py", line 439, in new_forward_with_hf_hook
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] outputs = new_forward(self, *args, **kwargs)
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/usr/local/python3.11.13/lib/python3.11/site-packages/cache_dit/caching/cache_adapters/cache_adapter.py", line 427, in new_forward
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] outputs = original_forward(*args, **kwargs)
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/root/vllm-workspace/vllm-omni/vllm_omni/diffusion/models/qwen_image/qwen_image_transformer.py", line 784, in forward
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] encoder_hidden_states, hidden_states = block(
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] return self._call_impl(*args, **kwargs)
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] return forward_call(*args, **kwargs)
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/usr/local/python3.11.13/lib/python3.11/site-packages/cache_dit/caching/cache_blocks/pattern_base.py", line 250, in forward
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] hidden_states, encoder_hidden_states = self.call_Fn_blocks(
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/usr/local/python3.11.13/lib/python3.11/site-packages/cache_dit/caching/cache_blocks/pattern_base.py", line 426, in call_Fn_blocks
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] hidden_states = block(
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] return self._call_impl(*args, **kwargs)
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] return forward_call(*args, **kwargs)
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/root/vllm-workspace/vllm-omni/vllm_omni/diffusion/models/qwen_image/qwen_image_transformer.py", line 575, in forward
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] attn_output = self.attn(
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] return self._call_impl(*args, **kwargs)
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] return forward_call(*args, **kwargs)
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/root/vllm-workspace/vllm-omni/vllm_omni/diffusion/models/qwen_image/qwen_image_transformer.py", line 427, in forward
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] joint_hidden_states = self.attn(
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] return self._call_impl(*args, **kwargs)
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] return forward_call(*args, **kwargs)
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/root/vllm-workspace/vllm-omni/vllm_omni/diffusion/attention/layer.py", line 102, in forward
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] out = self.parallel_strategy.post_attention(out, ctx)
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/root/vllm-workspace/vllm-omni/vllm_omni/diffusion/attention/parallel/ulysses.py", line 196, in post_attention
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] dist.all_gather(gathered_joint, output_joint, group=ctx.ulysses_pg)
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/distributed/c10d_logger.py", line 81, in wrapper
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] return func(*args, **kwargs)
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py", line 3879, in all_gather
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] work = group.allgather([tensor_list], [tensor], opts)
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] RuntimeError: Tensors must be contiguous
[Stage-0] ERROR 01-03 17:00:34 [gpu_worker.py:226] [ERROR] 2026-01-03-17:00:34 (PID:1242551, Device:1, RankID:1) ERR02002 DIST invalid type
[Stage-0] INFO 01-03 17:00:34 [gpu_worker.py:265] Worker 1: Received shutdown message
### Your current environment

The output of `python collect_env.py`

### 🐛 Describe the bug
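A likely cause, based on the traceback above: the failure happens inside `dist.all_gather(gathered_joint, output_joint, group=ctx.ulysses_pg)` in `ulysses.py:196`, and distributed collectives require dense, contiguous input tensors. If `output_joint` is produced by a view operation (e.g. a transpose or permute after attention), it will have permuted strides and fail this check. The sketch below only demonstrates the contiguity behavior; whether `output_joint` is actually non-contiguous here, and whether calling `.contiguous()` on it before the collective is the right fix for this code path, are assumptions that need verification against the vllm-omni source.

```python
import torch

# Collectives such as torch.distributed.all_gather require dense,
# contiguous tensors; the traceback fails with
# "RuntimeError: Tensors must be contiguous".

x = torch.randn(2, 8)

# transpose() returns a view with swapped strides, not a copy,
# so the result is no longer contiguous in memory.
y = x.transpose(0, 1)
print(y.is_contiguous())   # False

# .contiguous() materializes a dense copy that collectives accept.
z = y.contiguous()
print(z.is_contiguous())   # True
```

If that assumption holds, a minimal workaround would be to pass `output_joint.contiguous()` to `dist.all_gather` at the call site shown in the traceback.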