[Bugfix] Fix image edit RoPE crash when explicit height/width are provided by lishunyang12 · Pull Request #1265 · vllm-project/vllm-omni

lishunyang12 · 2026-02-07T20:20:10Z

Purpose

Fix a server crash (AssertionError: seqlen_ro must be >= seqlen) in the Qwen-Image-Edit online serving pipeline when users provide explicit height/width in extra_body for image editing with a non-square input image.

Root cause: In pre_process_func, the input image was resized to the user-specified (height, width) (e.g., 1024x1024), but the aspect-ratio-preserving calculated_height/calculated_width (e.g., 1056x992) were stored in additional_information. Downstream, the pipeline builds img_shapes from both:

img_shapes[0] = (1, 64, 64) from explicit 1024x1024 (noise latent)
img_shapes[1] = (1, 66, 62) from calculated 1056x992 (image latent)

The image latent's actual seq_len (4096) exceeded the RoPE seq_len computed from calculated_height/width (4092), triggering the assertion failure.
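The mismatch can be reproduced with plain arithmetic. The sketch below assumes each latent token covers a 16x16 pixel block (an 8x VAE downscale followed by 2x2 patching), which is consistent with the shapes quoted above. `latent_grid` and `seq_len` are hypothetical helpers for illustration, not functions from the pipeline.

```python
from typing import Tuple

Shape = Tuple[int, int, int]


def latent_grid(height: int, width: int, pixels_per_token: int = 16) -> Shape:
    """Return (frames, rows, cols) of latent tokens for an image of the given pixel size.

    pixels_per_token=16 is an assumption (8x VAE downscale times 2x2 patching).
    """
    return (1, height // pixels_per_token, width // pixels_per_token)


def seq_len(shape: Shape) -> int:
    """Number of tokens described by an img_shapes entry."""
    frames, rows, cols = shape
    return frames * rows * cols


# The image was resized to the explicit 1024x1024, so its latent has this many tokens.
actual_image_latent = latent_grid(1024, 1024)
# img_shapes[1] was built from calculated_height/calculated_width instead.
rope_image_shape = latent_grid(1056, 992)

print(actual_image_latent, seq_len(actual_image_latent))
print(rope_image_shape, seq_len(rope_image_shape))
# The RoPE table is shorter than the actual sequence, which is what the assertion rejects.
print(seq_len(rope_image_shape) >= seq_len(actual_image_latent))
```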

Fix: Resize and preprocess the input image using calculated_height/calculated_width instead of the explicit height/width, consistent with the fallback/debug path behavior (line 665). This ensures img_shapes[1] always matches the actual image latent dimensions.
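As a rough illustration of why the fix keeps the shapes in sync, the sketch below derives aspect-ratio-preserving dimensions and then uses the same pair for both the resize target and the RoPE shape. It assumes a target area of 1024x1024, rounding to multiples of 32, and 16 pixels per latent token; these assumptions reproduce the 1056x992 figure for a 514x556 (width x height) input but are not taken from the pipeline source. `calculate_dimensions` here is a hypothetical stand-in, not the pipeline's own helper.

```python
import math
from typing import Tuple


def calculate_dimensions(target_area: int, aspect_ratio: float, multiple: int = 32) -> Tuple[int, int]:
    """Return (height, width) near target_area that preserves aspect_ratio (width / height).

    Rounding each side to a multiple of 32 is an assumption made for this sketch.
    """
    width = math.sqrt(target_area * aspect_ratio)
    height = width / aspect_ratio
    width = round(width / multiple) * multiple
    height = round(height / multiple) * multiple
    return height, width


input_width, input_height = 514, 556  # the non-square input from the test result
calculated_height, calculated_width = calculate_dimensions(1024 * 1024, input_width / input_height)

# After the fix, the resize target and the stored dimensions are the same pair,
# so the image latent and its img_shapes entry cannot disagree.
resize_target = (calculated_height, calculated_width)
image_latent_shape = (1, resize_target[0] // 16, resize_target[1] // 16)
rope_image_shape = (1, calculated_height // 16, calculated_width // 16)

print(calculated_height, calculated_width)
print(image_latent_shape == rope_image_shape)
```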

Also fixes two issues in the example scripts:

The curl example crashed with Argument list too long on large images due to passing base64 data as a shell argument.
The Python client silently ignored --height/--width CLI arguments (never sent to server).

Test Plan

Restart the server with Qwen-Image-Edit model.
Test with non-square input image + explicit height/width via curl — should succeed instead of crashing:

IMG_B64=$(base64 -w0 input.png)

cat <<EOF > request.json
{
  "messages": [{
    "role": "user",
    "content": [
      {"type": "text", "text": "Convert this image to watercolor style"},
      {"type": "image_url", "image_url": {"url": "data:image/png;base64,$IMG_B64"}}
    ]
  }],
  "extra_body": {
    "height": 1024,
    "width": 1024,
    "num_inference_steps": 50,
    "guidance_scale": 1,
    "seed": 42
  }
}
EOF

curl -s http://localhost:8092/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d @request.json | jq -r '.choices[0].message.content[0].image_url.url' \
  | cut -d',' -f2 | base64 -d > output.png
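The same request body can also be built without a shell, which sidesteps argument-length limits entirely. This is a minimal sketch using only the standard library; `build_request` is a hypothetical helper, and the payload layout simply mirrors the JSON above.

```python
import base64
import json
from typing import Optional


def build_request(image_bytes: bytes, prompt: str,
                  height: Optional[int] = None, width: Optional[int] = None) -> dict:
    """Build the chat-completions payload used in the test plan above."""
    data_url = "data:image/png;base64," + base64.b64encode(image_bytes).decode("ascii")
    extra_body = {"num_inference_steps": 50, "guidance_scale": 1, "seed": 42}
    # Only send explicit dimensions when the caller asked for them.
    if height is not None:
        extra_body["height"] = height
    if width is not None:
        extra_body["width"] = width
    return {
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        }],
        "extra_body": extra_body,
    }


# Placeholder bytes stand in for the contents of input.png in this sketch.
payload = build_request(b"placeholder image bytes", "Convert this image to watercolor style",
                        height=1024, width=1024)
request_json = json.dumps(payload)
print(len(request_json), payload["extra_body"])
```

Writing `request_json` to `request.json` gives a file that can be posted with `curl -d @request.json` exactly as shown above.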

Output (explicit height=1024, width=1024):

Test with curl example script (auto-computed dims):

bash examples/online_serving/image_to_image/run_curl_image_edit.sh input.png "Convert to watercolor style"

Output (auto-computed dims):

Test with Python client (no explicit dims) — should still work as before:

python examples/online_serving/image_to_image/openai_chat_client.py \
    --input input.png --prompt "Convert to watercolor style"

Output (Python client, no explicit dims):

Test with Python client (explicit dims) — should work now that height/width are sent:

python examples/online_serving/image_to_image/openai_chat_client.py \
    --input input.png --prompt "Convert to watercolor style" \
    --height 1024 --width 1024

Output (Python client, explicit height=1024, width=1024):

Test Result

Before this fix, sending a non-square image (e.g., 514x556) with explicit height: 1024, width: 1024 crashes the server:

AssertionError: seqlen_ro must be >= seqlen

After this fix, the same request completes successfully and returns a valid output image.

Changes

| File | Change |
| --- | --- |
| `vllm_omni/diffusion/models/qwen_image/pipeline_qwen_image_edit.py` | Resize and preprocess input image using `calculated_height/width` instead of explicit `height/width` in `pre_process_func`, matching the fallback path |
| `examples/online_serving/image_to_image/run_curl_image_edit.sh` | Remove hardcoded `height`/`width` from `extra_body`; pipe base64 via stdin and use temp file for curl payload to avoid `ARG_MAX` limit on large images |
| `examples/online_serving/image_to_image/openai_chat_client.py` | Wire `height`/`width` into `extra_body` so CLI args actually take effect |

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft.

Signed-off-by: lishunyang <[email protected]>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3757b3f0ae

chatgpt-codex-connector · 2026-02-07T20:23:02Z

examples/online_serving/image_to_image/openai_chat_client.py

+    if height is not None:
+        extra_body["height"] = height
+    if width is not None:
+        extra_body["width"] = width


Allow omitting explicit height/width in chat client

After this change, edit_image forwards height/width whenever they are non-None, but main() still sets both CLI args to default 1024, so running the documented command without --height/--width now always sends square dimensions and disables the server’s auto aspect-ratio sizing path for non-square inputs. This is a behavior regression introduced by wiring these fields; if the intent is “optional explicit dims,” the parser defaults need to be None (or equivalent flag-presence detection) so omitted flags stay omitted in extra_body.
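One way to address this is sketched below, assuming the client uses `argparse`. The parser and the `build_extra_body` helper are hypothetical and only illustrate the `None`-default approach; they are not the code in `openai_chat_client.py`.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Image edit client (sketch)")
    # default=None lets the client distinguish "flag omitted" from "flag set to 1024".
    parser.add_argument("--height", type=int, default=None)
    parser.add_argument("--width", type=int, default=None)
    return parser


def build_extra_body(height: Optional[int], width: Optional[int]) -> dict:
    """Include height/width only when the user passed them explicitly."""
    extra_body = {}
    if height is not None:
        extra_body["height"] = height
    if width is not None:
        extra_body["width"] = width
    return extra_body


# Flags omitted: nothing is sent, so the server can pick aspect-ratio-preserving dims.
omitted = build_parser().parse_args([])
print(build_extra_body(omitted.height, omitted.width))

# Flags given: explicit dimensions are forwarded.
explicit = build_parser().parse_args(["--height", "1024", "--width", "1024"])
print(build_extra_body(explicit.height, explicit.width))
```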

hsliuustc0106 · 2026-02-07T23:05:27Z

@Bounty-hunter @ZJY0516 PTAL

Bounty-hunter · 2026-02-10T06:29:19Z

> @Bounty-hunter @ZJY0516 PTAL

LGTM, the new logic is consistent with Diffusers.

ZJY0516

LGTM. Could you please also check whether other Qwen models have the same issue?

lishunyang12 · 2026-02-10T20:44:38Z

> LGTM. Could you please also check whether other Qwen models have the same issue?

I will check whether this issue applies to other models.

…vided (vllm-project#1265) Signed-off-by: lishunyang <[email protected]> Co-authored-by: Hongsheng Liu <[email protected]>

lishunyang12 · 2026-02-25T11:15:40Z

Checked all Qwen pipelines:

pipeline_qwen_image.py — text-to-image only, no image input, single img_shapes entry. Not affected.
pipeline_qwen_image_edit_plus.py — written with separate CONDITION_IMAGE_SIZE/VAE_IMAGE_SIZE from the start, img_shapes consistently uses VAE dims. Not affected.
pipeline_qwen_image_layered.py — height/width always set to calculated_height/calculated_width (no user override path), so no mismatch possible. Not affected.

Only pipeline_qwen_image_edit.py had this issue.

initial commit

3757b3f

Signed-off-by: lishunyang <[email protected]>

lishunyang12 requested a review from hsliuustc0106 as a code owner February 7, 2026 20:20

chatgpt-codex-connector bot reviewed Feb 7, 2026

lishunyang12 added 2 commits February 8, 2026 04:38

fix: solve large file passing in issue

dae10db

Signed-off-by: lishunyang <[email protected]>

fix precommit

f3d5e5f

Signed-off-by: lishunyang <[email protected]>

Gaohan123 added the ready label (label to trigger buildkite CI) Feb 10, 2026

Gaohan123 added this to the v0.16.0 milestone Feb 10, 2026

ZJY0516 approved these changes Feb 10, 2026

ZJY0516 requested a review from SamitHuang February 10, 2026 07:28

Merge branch 'main' into online

322ba1d

hsliuustc0106 enabled auto-merge (squash) February 10, 2026 20:55

hsliuustc0106 merged commit 8a9644c into vllm-project:main Feb 10, 2026
7 checks passed

hsliuustc0106 merged 4 commits into vllm-project:main from lishunyang12:online
