[Fix] GLM Image #799
Conversation
Signed-off-by: JaredforReal <[email protected]>
LGTM
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a109b772ef
Pull request overview
This pull request fixes the Image2Image mode for GLM-Image by adding a preprocessing function and refining postprocessing usage. The changes move image preprocessing logic out of the pipeline's forward method into a dedicated pre_process_func that runs before batching, following the pattern used by other pipelines like QwenImageEdit.
Changes:
- Added `get_glm_image_pre_process_func` to handle condition image preprocessing before pipeline execution
- Refactored AR token generation to use processor-generated image grids directly
- Fixed critical parameter name bug (`kv_caches` → `kv_cache`) in transformer calls
- Moved postprocessing out of the pipeline's forward method to an external `post_process_func`
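The "preprocess before batching" pattern described above can be pictured with a minimal sketch. All names here (`glm_image_pre_process`, `run_pipeline`, the request dict shape) are illustrative assumptions, not the actual vllm_omni API:

```python
# Hypothetical sketch: preprocessing runs once per request *before* batching,
# so the pipeline's forward only ever sees already-prepared inputs.

def glm_image_pre_process(request: dict) -> dict:
    """Prepare the condition image ahead of pipeline execution."""
    image = request.get("condition_image")
    if image is not None:
        # Placeholder for real resize/normalize logic on the condition image.
        request["condition_image"] = {"pixels": image, "preprocessed": True}
    return request

def run_pipeline(requests: list[dict], pre_process_func=None) -> list[dict]:
    # Preprocessing happens here, outside forward(), once per request.
    if pre_process_func is not None:
        requests = [pre_process_func(r) for r in requests]
    # forward() would now operate on the batch of preprocessed requests.
    return requests

batch = run_pipeline([{"condition_image": "img.png"}], glm_image_pre_process)
```

This mirrors the QwenImageEdit-style flow the PR adopts: the pipeline stays batch-only, and per-request image handling lives in the registered pre-process function.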
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| vllm_omni/diffusion/registry.py | Registered the new GLM-Image preprocessing function in the registry |
| vllm_omni/diffusion/models/glm_image/pipeline_glm_image.py | Added preprocessing function, refactored AR generation logic, fixed kv_cache parameter name bug, and delegated postprocessing to external handler |
| vllm_omni/diffusion/models/glm_image/__init__.py | Exported the new preprocessing function |
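The registry change in `vllm_omni/diffusion/registry.py` follows a common lookup-table pattern. The sketch below is an assumption about its shape (the dict name, decorator, and keys are hypothetical), not the real registry code:

```python
# Hypothetical sketch of registering a model-specific pre-process function
# so the engine can resolve it by model name at request time.

PRE_PROCESS_REGISTRY: dict[str, object] = {}

def register_pre_process(model_name: str):
    """Decorator that maps a model name to its pre-process callable."""
    def decorator(fn):
        PRE_PROCESS_REGISTRY[model_name] = fn
        return fn
    return decorator

@register_pre_process("GLM-Image")
def get_glm_image_pre_process_func(request):
    # Real implementation would preprocess the condition image here.
    return request
```

With this in place, adding a new model's preprocessing is a one-line registration rather than a change to the pipeline's forward path.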
Commit history:
- init and registry
- implement glm_image_transformer.py
- update transformer
- init pipeline_glm_image.py
- remove pre process
- add check_input(), implement CFG parallel in diffuse(), align generate_prior_tokens
- fix check_input(prompt_embed), add KVCache for Image Edit
- print out vllm version
- update model config
- update worker
- update one import in AsyncOmniLLM (not finish all, but can run)
- update Qwen3 Omni ViT init based on updated interface (the update for Qwen3 Omni Thinker is not finished)
- Remove unnecessary override for OmniRequestState (the update for OmniRequestState is not finished)
- update model runner dummy run
- update ar scheduler
- update _preprocess, execute model and sample_tokens for AR Model Runner
- debug AR Scheduler
- update OmniGPUModelRunner._update_states
- update the offline LLM request sorting due to changed request id format
- update Qwen3 Omni to fit with the engine core logic
- update generation model runner
- debug GLM-Image Model
- remove deleted args from doc string
- [Model][Rebase] Add GLM-Image Model and Partial Rebase to v0.14.0 (Support AR Offline) (vllm-project#763)
- disable async scheduling for generation models, avoiding inconsistency from race condition
- Update Qwen 3 Omni
- [Fix] GLM Image (vllm-project#799)
- support online serving for Qwen3 Omni
- fix pre-commit
- inherit engine outputs
- supporting audio in video (not finished)
- Update Qwen2.5 Omni model to version 0.14, adding support for image and video input processing, and refining position handling for MRoPE; adjust the YAML configuration to disable async scheduling for consistency; code cleanup and formatting improvements
- debug Qwen 2.5 Omni
- update doc
- rebase to vllm 0.14.0
- unify query type
- fix build doc
- Dev/rebase 0.14.0 (vllm-project#813)
- update test import
- update version from 0.14.0rc2 to 0.14.0
- set vllm config for all CI
- update CI
- Fix CPU offload OOM and performance issues in GLM-Image pipeline:
  - Conditionally load vision_language_encoder, text_encoder, and vae to GPU only when CPU offload is disabled
  - Propagate cpu_offload_gb argument to enable_cpu_offload flag
  - Include vision_language_encoder in CPU offload hooks for proper AR model offloading
  - Fix device mismatch in generate_prior_tokens during CPU offload mode
- Fix shared memory broadcast hang in GLM-Image pipeline:
  - Add manual encoder activation support to SequentialOffloader
  - Explicitly trigger vision_language_encoder onload before get_image_features in pipeline
  - Prevents CPU-bound stalling during AR generation when offload is active
- Fix device mismatch in generate() by triggering offload hook
- Clean up temporary patch files
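The CPU-offload fixes in the commit history above hinge on one idea: submodules live on CPU and are moved to the accelerator only when explicitly activated, which is why the pipeline must trigger the encoder's onload before calling `get_image_features`. A minimal sketch of that mechanism, with all class and method names assumed rather than taken from the real `SequentialOffloader`:

```python
# Illustrative sketch of sequential CPU offload with manual activation.
# Names (SequentialOffloader, activate) are assumptions, not the real code.

class SequentialOffloader:
    def __init__(self, module_devices: dict[str, str]):
        # module name -> current device; everything starts offloaded to CPU.
        self.devices = dict(module_devices)

    def activate(self, name: str) -> None:
        """Manually onload one module to the GPU, offloading the others."""
        for key in self.devices:
            self.devices[key] = "cuda" if key == name else "cpu"

offloader = SequentialOffloader(
    {"vision_language_encoder": "cpu", "text_encoder": "cpu", "vae": "cpu"}
)
# The fix: onload the encoder *before* image-feature extraction, so AR
# generation never runs against a module still sitting on CPU (which is
# what caused the device mismatch / broadcast hang described above).
offloader.activate("vision_language_encoder")
```

Only one heavyweight module occupies GPU memory at a time, which is what keeps peak memory below `cpu_offload_gb` at the cost of explicit activation calls at each stage.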
Purpose
Test Plan
```shell
python image_edit.py --model GLM-Image --image qwen_image_output.png --prompt "make it cartoon style"
```
Test Result
Original:


Edited:
Detailed Logs:
Essential Elements of an Effective PR Description Checklist
- Update `supported_models.md` and `examples` for a new model.