
[Bugfix] Fix image edit RoPE crash when explicit height/width are provided #1265

Merged
hsliuustc0106 merged 4 commits into vllm-project:main from lishunyang12:online
Feb 10, 2026

@lishunyang12 (Contributor) commented Feb 7, 2026

Purpose

Fix a server crash (AssertionError: seqlen_ro must be >= seqlen) in the Qwen-Image-Edit online serving pipeline when users provide explicit height/width in extra_body for image editing with a non-square input image.

Root cause: In pre_process_func, the input image was resized to the user-specified (height, width) (e.g., 1024x1024), but the aspect-ratio-preserving calculated_height/calculated_width (e.g., 1056x992) were stored in additional_information. Downstream, the pipeline builds img_shapes from both:

  • img_shapes[0] = (1, 64, 64) from explicit 1024x1024 (noise latent)
  • img_shapes[1] = (1, 66, 62) from calculated 1056x992 (image latent)

The image latent's actual seq_len (4096) exceeded the RoPE seq_len computed from calculated_height/width (4092), triggering the assertion failure.
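
The mismatch can be reproduced with plain arithmetic. The sketch below assumes that one latent token covers a 16x16 pixel block (an 8x VAE downsample followed by 2x2 patch packing); that divisor is inferred from the shapes quoted above rather than taken from the pipeline source, and `latent_grid` and `seq_len` are hypothetical helper names.

```python
# Assumption: one latent token per 16x16 pixel block
# (8x VAE downsample, then 2x2 patch packing). Inferred from the
# shapes in the PR description, not read from the pipeline code.
PIXELS_PER_TOKEN_SIDE = 16


def latent_grid(height: int, width: int) -> tuple[int, int, int]:
    """Return an img_shapes-style entry: (frames, grid_h, grid_w)."""
    return (1, height // PIXELS_PER_TOKEN_SIDE, width // PIXELS_PER_TOKEN_SIDE)


def seq_len(shape: tuple[int, int, int]) -> int:
    """Number of tokens described by an img_shapes entry."""
    frames, grid_h, grid_w = shape
    return frames * grid_h * grid_w


# Before the fix, the input image was resized to the explicit 1024x1024 ...
actual_image_latent = latent_grid(1024, 1024)
# ... but the shape recorded for RoPE came from the calculated 1056x992.
recorded_image_shape = latent_grid(1056, 992)

print(actual_image_latent, seq_len(actual_image_latent))
print(recorded_image_shape, seq_len(recorded_image_shape))
```

The recorded shape describes fewer tokens than the latent actually contains, which is the condition the `seqlen_ro must be >= seqlen` assertion rejects.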

Fix: Resize and preprocess the input image using calculated_height/calculated_width instead of the explicit height/width, consistent with the fallback/debug path behavior (line 665). This ensures img_shapes[1] always matches the actual image latent dimensions.
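
As an illustration of where 1056x992 can come from, here is a minimal sketch modeled on the aspect-ratio helper in the Diffusers Qwen-Image edit pipeline. Whether `pre_process_func` uses exactly this rounding and a 1024*1024 target area is an assumption; `resize_target` is a hypothetical name, and 514x556 is read as width x height.

```python
import math


def calculate_dimensions(target_area: int, ratio: float) -> tuple[int, int]:
    """Pick (width, height) near target_area with the given width/height
    ratio, each rounded to a multiple of 32. Assumed to mirror the helper
    used upstream; not copied from this PR."""
    width = math.sqrt(target_area * ratio)
    height = width / ratio
    width = round(width / 32) * 32
    height = round(height / 32) * 32
    return width, height


def resize_target(image_width, image_height, explicit_height=None, explicit_width=None):
    """Hypothetical helper: size the input image is resized to after the fix.

    The explicit height/width are accepted but deliberately unused here,
    because the fix resizes the input image to the calculated dimensions
    regardless of what the user requested.
    """
    calculated_width, calculated_height = calculate_dimensions(
        1024 * 1024, image_width / image_height
    )
    return calculated_height, calculated_width


# A 514x556 input with explicit 1024x1024 still resizes to the calculated size.
print(resize_target(514, 556, explicit_height=1024, explicit_width=1024))
```

Because the resize target and the dimensions stored in additional_information now come from the same calculation, the two can no longer disagree.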

Also fixes two issues in the example scripts:

  • The curl example crashed with Argument list too long on large images due to passing base64 data as a shell argument.
  • The Python client silently ignored --height/--width CLI arguments (never sent to server).

Test Plan

  1. Restart the server with Qwen-Image-Edit model.

  2. Test with non-square input image + explicit height/width via curl — should succeed instead of crashing:

IMG_B64=$(base64 -w0 input.png)

cat <<EOF > request.json
{
  "messages": [{
    "role": "user",
    "content": [
      {"type": "text", "text": "Convert this image to watercolor style"},
      {"type": "image_url", "image_url": {"url": "data:image/png;base64,$IMG_B64"}}
    ]
  }],
  "extra_body": {
    "height": 1024,
    "width": 1024,
    "num_inference_steps": 50,
    "guidance_scale": 1,
    "seed": 42
  }
}
EOF

curl -s http://localhost:8092/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d @request.json | jq -r '.choices[0].message.content[0].image_url.url' \
  | cut -d',' -f2 | base64 -d > output.png

Output (explicit height=1024, width=1024):

[output image]

  3. Test with curl example script (auto-computed dims):
bash examples/online_serving/image_to_image/run_curl_image_edit.sh input.png "Convert to watercolor style"

Output (auto-computed dims):

[output image]

  4. Test with Python client (no explicit dims) — should still work as before:
python examples/online_serving/image_to_image/openai_chat_client.py \
    --input input.png --prompt "Convert to watercolor style"

Output (Python client, no explicit dims):

[output image]

  5. Test with Python client (explicit dims) — should work now that height/width are sent:
python examples/online_serving/image_to_image/openai_chat_client.py \
    --input input.png --prompt "Convert to watercolor style" \
    --height 1024 --width 1024

Output (Python client, explicit height=1024, width=1024):

[output image]
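
For readers who prefer to script the manual curl test, the request body and the response decoding can be sketched in Python. The payload layout copies the request.json shown in step 2 and the response path copies the jq filter; the function names are hypothetical, and the actual HTTP call is left out so the sketch stays self-contained.

```python
import base64
import json


def build_request(image_bytes: bytes, prompt: str, extra_body: dict) -> dict:
    """Build a chat-completions payload shaped like request.json in step 2."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = "data:image/png;base64," + encoded
    return {
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        }],
        "extra_body": extra_body,
    }


def write_request_file(payload: dict, path: str) -> None:
    """Write the payload to disk so the base64 data never becomes a shell argument."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(payload, handle)


def extract_image_bytes(response: dict) -> bytes:
    """Python equivalent of the jq | cut | base64 -d pipeline in step 2."""
    url = response["choices"][0]["message"]["content"][0]["image_url"]["url"]
    # Drop the "data:image/png;base64" prefix; keep everything after the first comma.
    _, _, encoded = url.partition(",")
    return base64.b64decode(encoded)
```

A file written by `write_request_file` can be sent with `curl -d @request.json`, as in step 2.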

Test Result

Before this fix, sending a non-square image (e.g., 514x556) with explicit height: 1024, width: 1024 crashes the server:

AssertionError: seqlen_ro must be >= seqlen

After this fix, the same request completes successfully and returns a valid output image.

Changes

| File | Change |
| --- | --- |
| vllm_omni/diffusion/models/qwen_image/pipeline_qwen_image_edit.py | Resize and preprocess input image using calculated_height/width instead of explicit height/width in pre_process_func, matching the fallback path |
| examples/online_serving/image_to_image/run_curl_image_edit.sh | Remove hardcoded height/width from extra_body; pipe base64 via stdin and use temp file for curl payload to avoid ARG_MAX limit on large images |
| examples/online_serving/image_to_image/openai_chat_client.py | Wire height/width into extra_body so CLI args actually take effect |

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft.

Signed-off-by: lishunyang <[email protected]>

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3757b3f0ae


Comment on lines +79 to +82
    if height is not None:
        extra_body["height"] = height
    if width is not None:
        extra_body["width"] = width

[P2] Allow omitting explicit height/width in chat client

After this change, edit_image forwards height/width whenever they are non-None, but main() still sets both CLI args to default 1024, so running the documented command without --height/--width now always sends square dimensions and disables the server’s auto aspect-ratio sizing path for non-square inputs. This is a behavior regression introduced by wiring these fields; if the intent is “optional explicit dims,” the parser defaults need to be None (or equivalent flag-presence detection) so omitted flags stay omitted in extra_body.
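
One way to get the "optional explicit dims" behaviour described in this comment is to default both flags to None. The sketch below is a standalone illustration, not the contents of openai_chat_client.py: `build_parser` and `build_extra_body` are hypothetical names, and only the four flags mentioned in the test plan are modeled.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser: --height/--width default to None, not 1024."""
    parser = argparse.ArgumentParser(description="Image edit client (sketch)")
    parser.add_argument("--input", required=True)
    parser.add_argument("--prompt", required=True)
    # None means "flag omitted", so the server can choose aspect-ratio-preserving dims.
    parser.add_argument("--height", type=int, default=None)
    parser.add_argument("--width", type=int, default=None)
    return parser


def build_extra_body(height, width) -> dict:
    """Include height/width only when the user actually passed them."""
    extra_body = {}
    if height is not None:
        extra_body["height"] = height
    if width is not None:
        extra_body["width"] = width
    return extra_body


# Omitted flags stay omitted from extra_body.
args = build_parser().parse_args(["--input", "input.png", "--prompt", "watercolor"])
print(build_extra_body(args.height, args.width))
```

With this shape, the documented command without `--height`/`--width` sends no dimensions at all, while passing both flags forwards them unchanged.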


@hsliuustc0106 (Collaborator)

@Bounty-hunter @ZJY0516 PTAL

@Bounty-hunter (Contributor)

> @Bounty-hunter @ZJY0516 PTAL

LGTM, the new logic is consistent with Diffusers.

@Gaohan123 Gaohan123 added the ready label to trigger buildkite CI label Feb 10, 2026
@Gaohan123 Gaohan123 added this to the v0.16.0 milestone Feb 10, 2026

@ZJY0516 (Collaborator) left a comment

LGTM. Could you please also check whether other Qwen models have the same issue?

@ZJY0516 ZJY0516 requested a review from SamitHuang February 10, 2026 07:28
@lishunyang12 (Contributor, Author)

> LGTM. Could you please also check whether other Qwen models have the same issue?

I will check whether this issue applies to other models.

@hsliuustc0106 hsliuustc0106 enabled auto-merge (squash) February 10, 2026 20:55
@hsliuustc0106 hsliuustc0106 merged commit 8a9644c into vllm-project:main Feb 10, 2026
7 checks passed
YanickSchraner pushed a commit to YanickSchraner/vllm-omni that referenced this pull request Feb 20, 2026
@lishunyang12 (Contributor, Author)

Checked all Qwen pipelines:

  • pipeline_qwen_image.py: text-to-image only, no image input, single img_shapes entry. Not affected.
  • pipeline_qwen_image_edit_plus.py: written with separate CONDITION_IMAGE_SIZE/VAE_IMAGE_SIZE from the start; img_shapes consistently uses VAE dims. Not affected.
  • pipeline_qwen_image_layered.py: height/width are always set to calculated_height/calculated_width (no user override path), so no mismatch is possible. Not affected.

Only pipeline_qwen_image_edit.py had this issue.
