[diffusion] add tp support for qwen-image and refactor some tests#830
SamitHuang merged 4 commits into vllm-project:main
Conversation
Signed-off-by: zjy0516 <[email protected]>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 35a98fb248
torch.cuda.empty_cache()
device_index = torch.cuda.current_device()
monitor = GPUMemoryMonitor(device_index=device_index, interval=0.02)
monitor.start()
Reset CUDA peak stats before collecting TP memory
GPUMemoryMonitor.peak_used_mb falls back to torch.cuda.max_memory_allocated/reserved, which track process-wide peaks and are not reset by empty_cache(). Since _run_zimage_generate is invoked twice in the same process, the TP=2 run inherits the TP=1 peak and can never report a lower value even if it actually uses less memory, making the new assertion flaky. Consider calling torch.cuda.reset_peak_memory_stats(device_index) before starting the monitor (or dropping the max_memory fallback) so each run measures its own peak.
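The pitfall can be illustrated without a GPU. The stand-in class below is hypothetical and greatly simplified (the real counter is torch.cuda.max_memory_allocated, reset via torch.cuda.reset_peak_memory_stats); it just demonstrates why a process-wide high-water mark makes a second run's reading inherit the first run's peak unless it is reset between runs.

```python
class PeakTracker:
    """Simplified stand-in for a process-wide CUDA peak-memory counter."""

    def __init__(self):
        self.current_mb = 0
        self.peak_mb = 0

    def allocate(self, mb):
        self.current_mb += mb
        self.peak_mb = max(self.peak_mb, self.current_mb)

    def free(self, mb):
        # Freeing memory does NOT lower the recorded peak,
        # just as empty_cache() does not reset CUDA peak stats.
        self.current_mb -= mb

    def reset_peak(self):
        # Analogous to torch.cuda.reset_peak_memory_stats(device_index).
        self.peak_mb = self.current_mb


tracker = PeakTracker()

# First run (think TP=1): peaks at 1000 MB, then frees everything.
tracker.allocate(1000)
tracker.free(1000)
print(tracker.peak_mb)  # 1000

# Second run (think TP=2) without a reset: it only uses 600 MB,
# but the process-wide counter still reports the earlier 1000 MB peak,
# so an assertion like "TP=2 peak < TP=1 peak" can never pass.
tracker.allocate(600)
print(tracker.peak_mb)  # 1000
tracker.free(600)

# With a reset between runs, each run measures its own peak.
tracker.reset_peak()
tracker.allocate(600)
print(tracker.peak_mb)  # 600
```

In the test itself, the fix is a single torch.cuda.reset_peak_memory_stats(device_index) call between empty_cache() and monitor.start().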
Purpose
Test
- Qwen Image: image size 1024x1024
- Qwen Image Edit: input image size (1242, 1483)