[Bugfix] Add Fake VllmConfig on NPU and add slicing/tiling args in Qwen-Image #145
Gaohan123 merged 10 commits into vllm-project:main from
Conversation
Signed-off-by: gcanlin <[email protected]>
💡 Codex Review
Here are some automated review suggestions for this pull request.
    return "fake"

    vllm_config = VllmConfig(model_config=ModelConfig(hf_config=FakePretrainedConfig()))
Instantiate ModelConfig without required model
NPU worker initialization now builds ModelConfig(hf_config=FakePretrainedConfig()) without supplying the mandatory model path. ModelConfig’s constructor/post-init requires a model identifier and will raise a TypeError before set_current_vllm_config runs, so the NPU worker process crashes during startup and diffusion on NPU never initializes.
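The two issues Codex raises — a mandatory `model` argument and unguarded `to_dict()` calls — can be illustrated with a minimal sketch. This is not the actual vLLM API (the real `ModelConfig`/`VllmConfig` signatures differ); the stub classes and the `hf_config_dict` helper below are hypothetical stand-ins:

```python
from dataclasses import dataclass
from typing import Any, Optional

# Hypothetical stand-ins for vLLM's config classes, for illustration only.
@dataclass
class FakePretrainedConfig:
    model_type: str = "fake"

    def to_dict(self) -> dict:
        return {"model_type": self.model_type}

@dataclass
class ModelConfig:
    model: str          # mandatory, like the real class; omitting it raises an error
    hf_config: Any = None

@dataclass
class VllmConfig:
    model_config: Optional[ModelConfig] = None

def hf_config_dict(vllm_config: Optional[VllmConfig]) -> dict:
    """Null-safe access of the kind the PR adds around to_dict() call sites."""
    if vllm_config is None or vllm_config.model_config is None:
        return {}
    return vllm_config.model_config.hf_config.to_dict()

# Supplying the required model id avoids the startup TypeError Codex flags.
cfg = VllmConfig(ModelConfig(model="fake-model", hf_config=FakePretrainedConfig()))
print(hf_config_dict(cfg))   # {'model_type': 'fake'}
print(hf_config_dict(None))  # {}
```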
@Gaohan123 @ZJY0516 Please also take a look. Thanks! I think that we could also consider making
Signed-off-by: gcanlin <[email protected]>
Could you please use env vars in this PR and update the related doc?
vllm_omni/diffusion/registry.py (Outdated)

    model.vae.use_slicing = True
    model.vae.use_tiling = True
I think we can take it as an input argument for model initialization. GPU, NPU, and other hardware can all use it. Later, for disaggregated serving, it can be an argument for the diffusion stage.
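The suggestion above — passing the flags at model initialization instead of hard-coding them in the registry — could look like the following sketch. The `DiffusionModelArgs` dataclass and `FakeVAE` stub are hypothetical; the real vllm-omni model loader and diffusers VAE classes differ:

```python
from dataclasses import dataclass

# Hypothetical VAE stub; the real diffusers VAE exposes similar flags.
class FakeVAE:
    def __init__(self):
        self.use_slicing = False
        self.use_tiling = False

@dataclass
class DiffusionModelArgs:
    """Hypothetical hardware-agnostic init args, as the comment suggests."""
    vae_use_slicing: bool = False
    vae_use_tiling: bool = False

def init_model(args: DiffusionModelArgs) -> FakeVAE:
    vae = FakeVAE()
    # Apply caller-provided flags instead of hard-coding True in the registry.
    vae.use_slicing = args.vae_use_slicing
    vae.use_tiling = args.vae_use_tiling
    return vae

vae = init_model(DiffusionModelArgs(vae_use_slicing=True, vae_use_tiling=True))
print(vae.use_slicing, vae.use_tiling)  # True True
```

Because the flags travel with the model args rather than an env var, the same mechanism would carry over to a per-stage argument in disaggregated serving.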
IMO, adding an env var is better only if GPU also needs this; we don't want to introduce an NPU-specific env var unless there is no other way to address it. WDYT?
I think it's also needed by GPU. cc @SamitHuang
Also cc @Gaohan123 @hsliuustc0106 @ywang96, seems that it needs to introduce the env module, which affects the user-facing side. |
Signed-off-by: gcanlin <[email protected]>
Signed-off-by: gcanlin <[email protected]>
@ywang96 Please make sure this PR is merged before the first RC, thanks.
…en-Image (vllm-project#145) Signed-off-by: gcanlin <[email protected]> Signed-off-by: Prajwal A <[email protected]>
…en-Image (vllm-project#145) Signed-off-by: gcanlin <[email protected]>


Purpose

There are calls to `vllm_config.model_config.hf_config.to_dict()` that do not check whether `vllm_config` is null.

Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
`supported_models.md` and `examples` for a new model.