[model] Add UltraFlux-v1-image support #611
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7af3b2287a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| "pipeline_wan2_2", | ||
| "Wan22Pipeline", | ||
| ), |
Preserve existing diffusion model registrations
This registry block now omits several existing pipelines (e.g., WanImageToVideoPipeline, BagelPipeline, LongCatImageEditPipeline, StableDiffusion3Pipeline) even though their modules still exist under vllm_omni/diffusion/models. Since initialize_model looks up od_config.model_class_name in _DIFFUSION_MODELS and raises a ValueError when missing, any config that previously used those model_class_name values will now fail to initialize. Please restore those entries or explicitly deprecate them across docs/configs.
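To make the failure mode concrete, here is a minimal sketch; the dict layout and the error message are paraphrased from this comment, not copied from vllm_omni/diffusion/registry.py:

```python
# Paraphrased sketch of the registry lookup described above.
_DIFFUSION_MODELS = {
    "Wan22Pipeline": ("wan2_2", "pipeline_wan2_2", "Wan22Pipeline"),
    # "WanImageToVideoPipeline" entry dropped by this diff
}

def initialize_model(model_class_name: str):
    # Mirrors the behavior described in the comment: a missing key
    # makes initialization fail outright.
    if model_class_name not in _DIFFUSION_MODELS:
        raise ValueError(f"Model class {model_class_name} not found")
    return _DIFFUSION_MODELS[model_class_name]

try:
    initialize_model("WanImageToVideoPipeline")
except ValueError as exc:
    print(exc)  # Model class WanImageToVideoPipeline not found
```

Any config still pointing at a removed key hits that ValueError, which is why the entries should be restored rather than silently dropped.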
vllm_omni/diffusion/registry.py
Outdated
| #UltraFlux | ||
| "FluxPipeline": ( | ||
| "ultraflux-v1_image", | ||
| "pipeline_ultraflux", | ||
| "UltraFluxPipeline", |
Register UltraFlux under UltraFluxPipeline key
The new UltraFlux entry is keyed as FluxPipeline, but the actual class and docs use UltraFluxPipeline. If users set model_class_name=UltraFluxPipeline as documented, _DIFFUSION_MODELS lookup will fail and initialize_model will raise “Model class UltraFluxPipeline not found.” Please register it under UltraFluxPipeline (or update naming consistently) so the model is loadable.
| { | ||
| "WanPipeline": enable_cache_for_wan22, | ||
| "FluxPipeline": enable_cache_for_flux, | ||
| "UltraFluxPipeline": enable_cache_for_flux, |
Maybe rename the new function accordingly, e.g. enable_cache_for_ultraflux.
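Something like the following; the function bodies are placeholders, and only the mapping shape comes from the quoted diff:

```python
def enable_cache_for_flux(pipe):
    ...  # placeholder for the real cache-dit hook

def enable_cache_for_ultraflux(pipe):
    ...  # placeholder; currently does the same as the Flux version,
         # but the dedicated name keeps the mapping self-documenting

# Mapping shape taken from the diff above.
CACHE_ADAPTERS = {
    "FluxPipeline": enable_cache_for_flux,
    "UltraFluxPipeline": enable_cache_for_ultraflux,
}
```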
| "pipeline_wan2_2", | ||
| "Wan22Pipeline", | ||
| ), | ||
| "WanImageToVideoPipeline": ( |
Do not delete existing entries; just add your pipeline.
vllm_omni/diffusion/registry.py
Outdated
| "StableDiffusion3Pipeline", | ||
| ), | ||
| #UltraFlux | ||
| "FluxPipeline": ( |
Rename this key to UltraFluxPipeline.
vllm_omni/diffusion/registry.py
Outdated
| od_config: OmniDiffusionConfig, | ||
| ): | ||
| model_class = DiffusionModelRegistry._try_load_model_cls(od_config.model_class_name) | ||
| print("DEBUG model_class =", model_class) |
vllm_omni/diffusion/registry.py
Outdated
| "LongCatImagePipeline", | ||
| ), | ||
| "BagelPipeline": ( | ||
| "BagelPipeline": ( |
vllm_omni/diffusion/registry.py
Outdated
| # where mod_folder and mod_relname are defined and mapped using `_DIFFUSION_MODELS` via the `arch` key | ||
| "QwenImageEditPipeline": "get_qwen_image_edit_pre_process_func", | ||
| "QwenImageEditPlusPipeline": "get_qwen_image_edit_plus_pre_process_func", | ||
| "QwenImageLayeredPipeline": "get_qwen_image_layered_pre_process_func", |
| @@ -4,7 +4,7 @@ | |||
| import multiprocessing as mp | |||
Adding a model should not require changes to diffusion_engine.
| | `WanPipeline` | Wan2.2-T2V, Wan2.2-TI2V | `Wan-AI/Wan2.2-T2V-A14B-Diffusers`, `Wan-AI/Wan2.2-TI2V-5B-Diffusers` | | ||
| | `WanImageToVideoPipeline` | Wan2.2-I2V | `Wan-AI/Wan2.2-I2V-A14B-Diffusers` | | ||
| | `OvisImagePipeline` | Ovis-Image | `OvisAI/Ovis-Image` | | ||
| |`LongcatImagePipeline` | LongCat-Image | `meituan-longcat/LongCat-Image` | |
mkdocs.yml
Outdated
| - "vllm_omni.entrypoints.async_diffusion" # avoid importing vllm in mkdocs building | ||
| - "vllm_omni.entrypoints.openai" # avoid importing vllm in mkdocs building | ||
| - "vllm_omni.entrypoints.openai.protocol" # avoid importing vllm in mkdocs building | ||
| - "vllm_omni.entrypoints.omni" # avoid importing vllm in mkdocs building |
Why do we need to change this?
vllm_omni/entrypoints/async_omni.py
Outdated
|
|
||
| # Summarize and print stats | ||
| try: | ||
| import json as _json |
Why do we need to change this? @Bounty-hunter PTAL
vllm_omni/entrypoints/log_utils.py
Outdated
| @@ -4,6 +4,7 @@ | |||
| from dataclasses import dataclass | |||
This PR is for adding a model; it should not make changes to these unrelated files.
vllm_omni/entrypoints/omni.py
Outdated
| def _initialize_stages(self, model: str, kwargs: dict[str, Any]) -> None: | ||
| """Initialize stage list management.""" | ||
| stage_init_timeout = kwargs.get("stage_init_timeout", 20) | ||
| # Diffusion/large models can take long to load; align default with CLI (300s) |
You should not change this default value here; provide the timeout via your CLI instead.
vllm_omni/entrypoints/omni_stage.py
Outdated
| pass | ||
| logger.debug("Engine initialized") | ||
|
|
||
| # Check if stage engine supports profiling (via vLLM's built-in profiler) |
Why do you need to change omni_stage in a model-support PR?
| } | ||
| ) | ||
|
|
||
| return result |
This should not be changed in this PR.
Force-pushed 560c1a1 to 448588b (Compare)
vllm_omni/diffusion/registry.py
Outdated
| "StableDiffusion3Pipeline", | ||
| ), | ||
| #UltraFlux | ||
| "FluxPipeline": ( |
| @@ -0,0 +1,931 @@ | |||
| # Copyright 2025 Black Forest Labs, The HuggingFace Team and The InstantX Team. All rights reserved. | |||
| | `WanImageToVideoPipeline` | Wan2.2-I2V | `Wan-AI/Wan2.2-I2V-A14B-Diffusers` | | ||
| | `OvisImagePipeline` | Ovis-Image | `OvisAI/Ovis-Image` | | ||
| |`LongcatImagePipeline` | LongCat-Image | `meituan-longcat/LongCat-Image` | | ||
| |`LongCatImageEditPipeline` | LongCat-Image-Edit | `meituan-longcat/LongCat-Image-Edit` | |
Please also update the diffusion acceleration doc for cache-dit support.
@david6666666 can we not use benchmark serving under the benchmarks folder for t2i jobs?

Please make similar modifications based on the review comments in #809.
Signed-off-by: Chen Yang <[email protected]>
…in Ring Attention (vllm-project#767) Signed-off-by: XU Mingshi <[email protected]> Signed-off-by: mxuax <[email protected]> Signed-off-by: Chen Yang <[email protected]>
Signed-off-by: Sihyeon Jang <[email protected]> Signed-off-by: Chen Yang <[email protected]>
Signed-off-by: zjy0516 <[email protected]> Signed-off-by: Chen Yang <[email protected]>
Signed-off-by: iwzbi <[email protected]> Signed-off-by: Chen Yang <[email protected]>
…#722) Signed-off-by: ZeldaHuang <[email protected]> Signed-off-by: Chen Yang <[email protected]>
…ject#781) Signed-off-by: Yuhan Liu <[email protected]> Signed-off-by: Chen Yang <[email protected]>
Signed-off-by: David Chen <[email protected]> Signed-off-by: Chen Yang <[email protected]> # Conflicts: # vllm_omni/diffusion/registry.py
Force-pushed 8fc9d02 to 67a48fb (Compare)
Signed-off-by: erfgss <[email protected]>
Signed-off-by: Chen Yang <[email protected]>
Signed-off-by: Chen Yang <[email protected]>
| |`Qwen3TTSForConditionalGeneration` | Qwen3-TTS-12Hz-1.7B-VoiceDesign | `Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign` | | ||
| |`Qwen3TTSForConditionalGeneration` | Qwen3-TTS-12Hz-1.7B-Base | `Qwen/Qwen3-TTS-12Hz-0.6B-Base` | | ||
|
|
||
| |`UltraFluxPipeline` | UltraFlux-v1 | `Owen777/UltraFlux-v1` | |
Looks like this diff might have accidentally removed the three Qwen3-TTS entries — probably a rebase artifact? The UltraFlux line should be added alongside them.
| self.default_sample_size = 64 | ||
|
|
||
| print(self.vae.config) | ||
| print(self.transformer.config) |
A few debug print() calls here — might want to swap them for logger.debug() or remove before merging.
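For reference, a small sketch of the difference — at the default INFO level, `logger.debug` output is suppressed while a bare `print` always fires (logger name and config dict here are illustrative):

```python
import io
import logging

logger = logging.getLogger("ultraflux.demo")  # hypothetical logger name
buf = io.StringIO()
logger.addHandler(logging.StreamHandler(buf))
logger.setLevel(logging.INFO)

# Suppressed at INFO level, so production logs stay clean:
logger.debug("vae config: %s", {"scale_factor": 32})
print(buf.getvalue() == "")  # True
```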
| scheduler=scheduler, | ||
| ) | ||
|
|
||
| self.vae_scale_factor = 32 |
Quick question — vae_scale_factor = 32 while standard Flux uses 16. Is 32 correct for UltraFlux, or a copy-paste from somewhere? If it's intentional, a brief comment explaining why would be helpful.
| def _get_fused_projections(attn: "FluxAttention", hidden_states, encoder_hidden_states=None): | ||
| query, key, value = attn.to_qkv(hidden_states).chunk(3, dim=-1) | ||
|
|
||
| encoder_query = encoder_key = encoder_value = (None,) |
I think there might be a small issue here — (None,) creates a 1-tuple rather than assigning None to all three variables. This could cause problems downstream when calling .unflatten(...) on a tuple. Maybe:
encoder_query = encoder_key = encoder_value = None

| mscale = torch.where( | ||
| scale <= 1.0, torch.tensor(1.0, device=scale.device, dtype=scale.dtype), 0.1 * torch.log(scale) + 1.0 | ||
| ) | ||
| mscale = torch.where( |
Looks like this mscale computation is a duplicate of lines 612-614. Probably a copy-paste leftover — the second one could be removed.
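For reference, a scalar pure-Python re-expression of the quoted torch.where formula, computed once — the second copy in the diff can simply be deleted:

```python
import math

def mscale(scale: float) -> float:
    # Mirrors: torch.where(scale <= 1.0, 1.0, 0.1 * log(scale) + 1.0)
    return 1.0 if scale <= 1.0 else 0.1 * math.log(scale) + 1.0

print(mscale(0.5))               # 1.0  (scale <= 1.0 branch)
print(round(mscale(math.e), 3))  # 1.1  (0.1 * ln(e) + 1.0)
```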
| self.added_kv_proj_dim = added_kv_proj_dim | ||
| self.added_proj_bias = added_proj_bias | ||
|
|
||
| self.norm_q = torch.nn.RMSNorm(dim_head, eps=eps, elementwise_affine=elementwise_affine) |
I saw in the earlier review that the maintainer suggested using from vllm.model_executor.layers.layernorm import RMSNorm instead of torch.nn.RMSNorm. Looks like it still needs to be updated here and on lines 300, 311, 312.
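Whichever implementation ends up being used, both should compute the same math. A pure-Python scalar reference (not the vllm API) for sanity-checking the swap:

```python
import math

def rms_norm(xs, weight=None, eps=1e-6):
    # y_i = x_i / sqrt(mean(x^2) + eps) * w_i
    rms = math.sqrt(sum(x * x for x in xs) / len(xs) + eps)
    w = weight if weight is not None else [1.0] * len(xs)
    return [x / rms * wi for x, wi in zip(xs, w)]

print([round(v, 3) for v in rms_norm([3.0, 4.0])])  # [0.849, 1.131]
```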
| for tok_name in ("tokenizer", "tokenizer_2"): | ||
| tok = getattr(self.pipe, tok_name, None) | ||
| if tok is not None and hasattr(tok, "model_max_length"): | ||
| tok.model_max_length = 512 |
Just wondering — hardcoding tok.model_max_length = 512 overrides whatever the tokenizer originally had. Would it make sense to read this from the model config instead?
| "LongCatImagePipeline": enable_cache_for_longcat_image, | ||
| "LongCatImageEditPipeline": enable_cache_for_longcat_image, | ||
| "UltraFluxPipeline": enable_cache_for_ultraflux, | ||
| "LongcatImagePipeline": enable_cache_for_longcat_image, |
The LongCatImagePipeline -> LongcatImagePipeline rename seems like a separate change. Might be worth mentioning in the PR description, or splitting it out if you prefer?
| self.tokenizer_max_length = ( | ||
| self.tokenizer.model_max_length if hasattr(self, "tokenizer") and self.tokenizer is not None else 77 | ||
| ) | ||
| self.default_sample_size = 64 |
With default_sample_size = 64 and vae_scale_factor = 32, the default resolution would be 2048x2048, but the test plan uses 4096x4096. Is 2048 the intended default?
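The arithmetic behind the question, with values taken from the quoted diff, assuming the usual sample_size * vae_scale_factor convention for the default output edge:

```python
# Values from the quoted diff.
default_sample_size = 64
vae_scale_factor = 32

# Default output edge in pixels under the assumed convention:
print(default_sample_size * vae_scale_factor)  # 2048
```

So a 4096x4096 test run must be passing height/width explicitly rather than relying on the default.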
Purpose
Add UltraFlux-v1-image support #327
Test Plan
Test Result
without cache_dit
with cache_dit
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.

BEFORE SUBMITTING, PLEASE READ https://github.com/vllm-project/vllm-omni/blob/main/CONTRIBUTING.md (anything written below this line will be removed by GitHub Actions)