[Diffusion] Refactor diffusion models weights loading#157

Merged
Isotr0py merged 19 commits into vllm-project:main from Isotr0py:refactor-diffusion-weight-loader
Dec 6, 2025

Conversation


@Isotr0py Isotr0py commented Dec 2, 2025


Purpose

  • This PR introduces DiffusersPipelineLoader to load weights from the subfolders of a standard diffusers model repo.
  • This PR also reuses AutoWeightsLoader from the main repo to load weights recursively.

Test Plan

python examples/offline_inference/qwen_image/text_to_image.py
python examples/offline_inference/qwen_image/text_to_image.py --model Tongyi-MAI/Z-Image-Turbo --num_inference_steps 10

Test Result

Both the Z-Image and Qwen-Image pipelines still work.



@Isotr0py Isotr0py changed the title [Diffusion] Refactor Qwen-image weight loader [Diffusion] Refactor diffusion models weights loading Dec 3, 2025
Signed-off-by: Isotr0py <[email protected]>
@Isotr0py Isotr0py marked this pull request as ready for review December 3, 2025 17:06

Comment on lines +154 to +162
self.weights_sources = [
    DiffusersPipelineLoader.ComponentSource(
        model_or_path=od_config.model,
        subfolder="transformer",
        revision=None,
        prefix="transformer.",
        fall_back_to_pt=True,
    )
]
@Isotr0py (Member Author)
I think we should also unify the text_encoder, vae, and other submodules' weight loading along these lines in follow-up PRs.

Otherwise, AutoModel.from_pretrained has already loaded their weights at the model initialization stage here, which makes weight-based parallelism integration harder to handle.
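As a rough illustration of that unification, a sketch along these lines could build one source per pipeline component. This is purely hypothetical: `build_weights_sources` and the plain-dict fields are illustrative stand-ins for the PR's DiffusersPipelineLoader.ComponentSource, not actual vllm-omni API.

```python
# Hypothetical sketch: register every diffusers pipeline component as a
# weight source, instead of letting AutoModel.from_pretrained eagerly load
# text_encoder/vae weights at model-init time. Plain dicts stand in for
# DiffusersPipelineLoader.ComponentSource so the sketch is self-contained.
def build_weights_sources(model_or_path: str, subfolders: list[str]) -> list[dict]:
    return [
        {
            "model_or_path": model_or_path,
            "subfolder": sub,
            "revision": None,
            "prefix": f"{sub}.",      # maps checkpoint keys onto module names
            "fall_back_to_pt": True,  # fall back to .bin if no safetensors
        }
        for sub in subfolders
    ]

sources = build_weights_sources("some/model", ["transformer", "text_encoder", "vae"])
```

With every component registered this way, a single recursive loader can own all weight movement, which is what makes parallelism-aware loading tractable.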

Collaborator:
Agreed

@ZJY0516 (Collaborator):

LGTM. Does this PR support modelscope weight loading?


@Isotr0py

Isotr0py commented Dec 4, 2025

Does this PR support modelscope weight loading?

Yes, it works with the modelscope patch omni_snapshot_download at the entrypoint level:

def omni_snapshot_download(model_id) -> str:
    # TODO: this is just a workaround to quickly use modelscope; we should support
    # modelscope in the weight loading feature instead of using `snapshot_download`
    if os.environ.get("VLLM_USE_MODELSCOPE", False):
        from modelscope.hub.snapshot_download import snapshot_download
        return snapshot_download(model_id)
    else:
        return _dummy_snapshot_download(model_id)

It still can't work without this patch, because some vllm-omni-specific weight utility functions lack a modelscope workaround:

def download_weights_from_hf_specific(
    model_name_or_path: str,
    cache_dir: Optional[str],
    allow_patterns: list[str],
    revision: Optional[str] = None,
    ignore_patterns: Optional[Union[str, list[str]]] = None,
) -> str:
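A hub-agnostic fallback for such utilities could be sketched as below. This is a hypothetical helper (`snapshot_download_compat` is not real vllm-omni API); the download callables are injected so the sketch runs without huggingface_hub or modelscope installed. Note the explicit string comparison: a plain `os.environ.get("VLLM_USE_MODELSCOPE", False)` as in the patch above treats any non-empty string, including "0" or "False", as enabled.

```python
import os

def snapshot_download_compat(model_id, hf_download, ms_download):
    # Hypothetical dispatcher: in practice hf_download/ms_download would be
    # huggingface_hub.snapshot_download and
    # modelscope.hub.snapshot_download. Only "1"/"true" (case-insensitive)
    # enable the ModelScope path.
    use_ms = os.environ.get("VLLM_USE_MODELSCOPE", "").lower() in ("1", "true")
    return ms_download(model_id) if use_ms else hf_download(model_id)
```

Routing every download through one such dispatcher would let the entrypoint-level patch be dropped entirely.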

@ZJY0516

ZJY0516 commented Dec 4, 2025


Do you know whether vLLM already provides similar modelscope-compatible utilities? Also cc @MengqingCao

Signed-off-by: Isotr0py <[email protected]>
Collaborator:

It would be good to show the memory usage and time cost of model loading, similar to vLLM:

(EngineCore_DP0 pid=2964220) INFO 12-04 06:28:57 [default_loader.py:267] Loading weights took 3.21 seconds
(EngineCore_DP0 pid=2964220) INFO 12-04 06:28:57 [gpu_model_runner.py:2653] Model loading took 1.6714 GiB and 5.854078 seconds

@Isotr0py (Member Author):

Done in 22f0f84

INFO 12-04 17:45:34 [parallel_state.py:1208] rank 0 in world size 1 is assigned as DP rank 0, PP rank 0, TP rank 0, EP rank 0
INFO:vllm_omni.diffusion.worker.gpu_worker:Worker 0: Initialized device and distributed environment.
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:04<00:00,  1.58s/it]
WARNING 12-04 17:45:39 [__init__.py:755] Current vLLM config is not set.
Loading safetensors checkpoint shards:   0% Completed | 0/3 [00:00<?, ?it/s]
Loading safetensors checkpoint shards:  33% Completed | 1/3 [00:03<00:06,  3.48s/it]
Loading safetensors checkpoint shards:  67% Completed | 2/3 [00:06<00:03,  3.34s/it]
Loading safetensors checkpoint shards: 100% Completed | 3/3 [00:08<00:00,  2.50s/it]
Loading safetensors checkpoint shards: 100% Completed | 3/3 [00:08<00:00,  2.74s/it]

INFO:vllm_omni.diffusion.model_loader.diffusers_loader:Loading weights took 8.44 seconds
INFO:vllm_omni.diffusion.worker.gpu_worker:Model loading took 19.2180 GiB and 14.825355 seconds
INFO:vllm_omni.diffusion.worker.gpu_worker:Worker 0: Model loaded successfully.
INFO:vllm_omni.diffusion.worker.gpu_worker:Worker 0: Scheduler loop started.
INFO:vllm_omni.diffusion.worker.gpu_worker:Worker 0 ready to receive requests via shared memory
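The wall-clock/memory report in those log lines can be produced by a small wrapper like this sketch. It is hypothetical (`timed_load` is not the PR's actual implementation); the memory probe is injected, e.g. torch.cuda.memory_allocated in practice, so the sketch runs without a GPU.

```python
import time

def timed_load(load_fn, mem_bytes_fn):
    # Hypothetical sketch of the "Model loading took X GiB and Y seconds"
    # report: measure the wall time of load_fn and the change in allocated
    # bytes reported by mem_bytes_fn around it.
    mem_before = mem_bytes_fn()
    start = time.perf_counter()
    model = load_fn()
    elapsed = time.perf_counter() - start
    used_gib = (mem_bytes_fn() - mem_before) / (1 << 30)
    print(f"Model loading took {used_gib:.4f} GiB and {elapsed:.6f} seconds")
    return model, used_gib
```

Measuring the allocation delta rather than the absolute value keeps the number meaningful when other tensors are already resident on the device.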

Signed-off-by: Isotr0py <[email protected]>
@ZJY0516 (Collaborator):
LGTM, generally

@Isotr0py Isotr0py enabled auto-merge (squash) December 6, 2025 07:25
@Isotr0py Isotr0py merged commit 059f45e into vllm-project:main Dec 6, 2025
4 checks passed
@Isotr0py Isotr0py deleted the refactor-diffusion-weight-loader branch December 6, 2025 11:15
LawJarp-A pushed a commit to LawJarp-A/vllm-omni that referenced this pull request Dec 12, 2025
LawJarp-A pushed a commit to LawJarp-A/vllm-omni that referenced this pull request Dec 12, 2025
faaany pushed a commit to faaany/vllm-omni that referenced this pull request Dec 19, 2025
princepride pushed a commit to princepride/vllm-omni that referenced this pull request Jan 10, 2026