[Bugfix] Fix image edit RoPE crash when explicit height/width are provided#1265
hsliuustc0106 merged 4 commits into vllm-project:main
Signed-off-by: lishunyang <[email protected]>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3757b3f0ae
    if height is not None:
        extra_body["height"] = height
    if width is not None:
        extra_body["width"] = width
Allow omitting explicit height/width in chat client
After this change, edit_image forwards height/width whenever they are non-None, but main() still sets both CLI args to default 1024, so running the documented command without --height/--width now always sends square dimensions and disables the server’s auto aspect-ratio sizing path for non-square inputs. This is a behavior regression introduced by wiring these fields; if the intent is “optional explicit dims,” the parser defaults need to be None (or equivalent flag-presence detection) so omitted flags stay omitted in extra_body.
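One way to address this review comment is sketched below. It is a minimal illustration, not the client's actual code: `build_parser` and `build_extra_body` are hypothetical helper names, and only the `--input`, `--prompt`, `--height`, and `--width` flags are taken from the PR. The idea is that `default=None` lets the client tell an omitted flag apart from an explicit value, so omitted dimensions never reach `extra_body`.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser: only the flag names come from the PR."""
    parser = argparse.ArgumentParser(description="Image edit client (sketch)")
    parser.add_argument("--input", required=True)
    parser.add_argument("--prompt", required=True)
    # default=None distinguishes "flag omitted" from "flag set to 1024".
    parser.add_argument("--height", type=int, default=None)
    parser.add_argument("--width", type=int, default=None)
    return parser


def build_extra_body(height: Optional[int], width: Optional[int]) -> dict:
    """Include a dimension only when the user actually supplied it."""
    extra_body = {}
    if height is not None:
        extra_body["height"] = height
    if width is not None:
        extra_body["width"] = width
    return extra_body


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(build_extra_body(args.height, args.width))
```

With this shape, running the client without `--height`/`--width` sends an empty `extra_body`, which leaves the server free to pick aspect-ratio-preserving dimensions.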
@Bounty-hunter @ZJY0516 PTAL
LGTM, the new logic is consistent with Diffusers.
ZJY0516 left a comment
LGTM. Could you please also check whether other Qwen models have the same issue?
I will keep track of whether this issue applies to other models.
…vided (vllm-project#1265) Signed-off-by: lishunyang <[email protected]> Co-authored-by: Hongsheng Liu <[email protected]>
Checked all Qwen pipelines:
Only |
Purpose
Fix a server crash (AssertionError: seqlen_ro must be >= seqlen) in the Qwen-Image-Edit online serving pipeline when users provide explicit height/width in extra_body for image editing with a non-square input image.

Root cause: In pre_process_func, the input image was resized to the user-specified (height, width) (e.g., 1024x1024), but the aspect-ratio-preserving calculated_height/calculated_width (e.g., 1056x992) were stored in additional_information. Downstream, the pipeline builds img_shapes from both:
- img_shapes[0] = (1, 64, 64) from explicit 1024x1024 (noise latent)
- img_shapes[1] = (1, 66, 62) from calculated 1056x992 (image latent)

The image latent's actual seq_len (4096) exceeded the RoPE seq_len computed from calculated_height/width (4092), triggering the assertion failure.

Fix: Resize and preprocess the input image using calculated_height/calculated_width instead of the explicit height/width, consistent with the fallback/debug path behavior (line 665). This ensures img_shapes[1] always matches the actual image latent dimensions.

Also fixes two issues in the example scripts:
- Argument list too long on large images due to passing base64 data as a shell argument.
- --height/--width CLI arguments (never sent to server).

Test Plan
Restart the server with the Qwen-Image-Edit model.

Test with a non-square input image + explicit height/width via curl; the request should succeed instead of crashing:

Output (explicit height=1024, width=1024):

    bash examples/online_serving/image_to_image/run_curl_image_edit.sh input.png "Convert to watercolor style"

Output (auto-computed dims):

    python examples/online_serving/image_to_image/openai_chat_client.py \
        --input input.png --prompt "Convert to watercolor style"

Output (Python client, no explicit dims):

height/width are sent:

    python examples/online_serving/image_to_image/openai_chat_client.py \
        --input input.png --prompt "Convert to watercolor style" \
        --height 1024 --width 1024

Output (Python client, explicit height=1024, width=1024):
Test Result
Before this fix, sending a non-square image (e.g., 514x556) with explicit height: 1024, width: 1024 crashes the server with AssertionError: seqlen_ro must be >= seqlen.

After this fix, the same request completes successfully and returns a valid output image.
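The mismatch described under Root cause can be checked with plain arithmetic. The sketch below is an illustration under stated assumptions, not the pipeline's code: `calculate_dimensions` is modeled on the aspect-ratio helper used by Diffusers-style Qwen image-edit pipelines (target area 1024x1024, rounding to multiples of 32), and `latent_grid` assumes one token per 16x16 pixel patch, which is the factor implied by 1024 mapping to 64 in the PR description. Both function names and the 514-wide by 556-tall reading of the test image are assumptions.

```python
import math


def calculate_dimensions(target_area: int, ratio: float, multiple: int = 32):
    """Assumed helper: pick (width, height) near target_area for a given w/h ratio."""
    width = math.sqrt(target_area * ratio)
    height = width / ratio
    width = round(width / multiple) * multiple
    height = round(height / multiple) * multiple
    return width, height


def latent_grid(height: int, width: int, patch: int = 16):
    """Assumed mapping from pixel size to an img_shapes entry (frames, h, w)."""
    return (1, height // patch, width // patch)


def seq_len(shape) -> int:
    """Number of tokens covered by one img_shapes entry."""
    return shape[0] * shape[1] * shape[2]


# Input image assumed to be 514 wide by 556 tall.
calc_w, calc_h = calculate_dimensions(1024 * 1024, 514 / 556)
rope_shape = latent_grid(calc_h, calc_w)          # shape recorded for RoPE

# Before the fix: the image was resized to the explicit 1024x1024.
tokens_before = seq_len(latent_grid(1024, 1024))
# After the fix: the image is resized to the calculated dimensions.
tokens_after = seq_len(latent_grid(calc_h, calc_w))

print(calc_h, calc_w, rope_shape)
print(tokens_before, seq_len(rope_shape), tokens_before <= seq_len(rope_shape))
print(tokens_after, seq_len(rope_shape), tokens_after <= seq_len(rope_shape))
```

Under these assumptions the calculated size is 1056x992, so the RoPE table covers 66 * 62 = 4092 positions while a 1024x1024 image latent needs 4096, which is the condition the assertion rejects. Resizing to the calculated size makes both counts 4092.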
Changes
- vllm_omni/diffusion/models/qwen_image/pipeline_qwen_image_edit.py: Use calculated_height/width instead of explicit height/width in pre_process_func, matching the fallback path
- examples/online_serving/image_to_image/run_curl_image_edit.sh: height/width from extra_body; pipe base64 via stdin and use temp file for curl payload to avoid ARG_MAX limit on large images
- examples/online_serving/image_to_image/openai_chat_client.py: Pass height/width into extra_body so CLI args actually take effect
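The ARG_MAX workaround in the shell script can be illustrated in Python. This is a hedged sketch of the general technique, not a port of run_curl_image_edit.sh: the payload field names, the helper names `write_payload_file` and `curl_argv`, and the URL passed in are all placeholders. The point is that the base64 data lives in a temporary file and curl reads it through `--data-binary @file`, so the command line itself stays short no matter how large the image is.

```python
import base64
import json
import os
import tempfile
from typing import List, Optional


def write_payload_file(image_bytes: bytes, prompt: str,
                       height: Optional[int] = None,
                       width: Optional[int] = None) -> str:
    """Write a JSON request body to a temp file and return its path.

    The payload layout here is hypothetical; adapt it to the real API.
    """
    payload = {
        "prompt": prompt,
        "image": base64.b64encode(image_bytes).decode("ascii"),
    }
    if height is not None:
        payload["height"] = height
    if width is not None:
        payload["width"] = width
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
        json.dump(payload, f)
        return f.name


def curl_argv(url: str, payload_path: str) -> List[str]:
    """Build a curl command that reads the body from a file, not from argv."""
    return [
        "curl", "-s", url,
        "-H", "Content-Type: application/json",
        "--data-binary", f"@{payload_path}",
    ]


if __name__ == "__main__":
    path = write_payload_file(b"\x00" * 1_000_000, "Convert to watercolor style")
    try:
        # Placeholder URL; the real endpoint is not specified here.
        print(curl_argv("http://localhost:8000/placeholder", path))
    finally:
        os.remove(path)
```

The caller remains responsible for deleting the temporary file once the request has been sent.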