[Diffusion] Refactor diffusion models weights loading #157
Isotr0py merged 19 commits into vllm-project:main from
Signed-off-by: Isotr0py <[email protected]>
    self.weights_sources = [
        DiffusersPipelineLoader.ComponentSource(
            model_or_path=od_config.model,
            subfolder="transformer",
            revision=None,
            prefix="transformer.",
            fall_back_to_pt=True,
        )
    ]
I think we should also unify the weights loading of text_encoder, vae and the other submodules in this way in follow-up PRs.
Otherwise, AutoModel.from_pretrained has already loaded their weights at the model initialization stage here, which makes weights-based parallelism harder to integrate.
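To make the suggestion concrete, here is a minimal sketch of what declaring every submodule through the same mechanism could look like. The `ComponentSource` dataclass below is a hypothetical stand-in that only mirrors the fields visible in the snippet above; it is not the class from this PR, and `declare_sources`, `with_prefix` and the repo name are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional, Tuple


@dataclass
class ComponentSource:
    """Hypothetical stand-in mirroring the fields used in the snippet above."""

    model_or_path: str
    subfolder: Optional[str]
    revision: Optional[str]
    prefix: str = ""
    fall_back_to_pt: bool = True


def declare_sources(model: str, components: Iterable[str]) -> List[ComponentSource]:
    """Build one source per pipeline component, each read from its own subfolder."""
    return [
        ComponentSource(
            model_or_path=model,
            subfolder=name,
            revision=None,
            prefix=f"{name}.",
            fall_back_to_pt=True,
        )
        for name in components
    ]


def with_prefix(
    source: ComponentSource, weights: Iterable[Tuple[str, object]]
) -> Iterator[Tuple[str, object]]:
    """Rename checkpoint entries so they match the pipeline-level parameter names."""
    for name, tensor in weights:
        yield source.prefix + name, tensor


# "some-org/some-diffusers-repo" is a placeholder, not a real repository.
sources = declare_sources(
    "some-org/some-diffusers-repo", ["transformer", "text_encoder", "vae"]
)
renamed = list(with_prefix(sources[2], [("decoder.conv_in.weight", 0)]))
```

With every component declared this way, no weights would need to be materialized at construction time, which is the property the comment above asks for.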
ZJY0516 left a comment
LGTM. Does this PR support loading weights from ModelScope?
Yes, it works with the ModelScope patch:
vllm-omni/vllm_omni/entrypoints/omni.py Lines 14 to 22 in f73791d
It still can't work without this patch, because some vllm-omni specific weights utils functions don't have
vllm-omni/vllm_omni/model_executor/model_loader/weight_utils.py Lines 13 to 19 in f73791d
Do you know whether vLLM already provides similar ModelScope-compatible utilities? Also cc @MengqingCao
It would be good to show the memory usage and time cost of model loading, similar to vLLM:
(EngineCore_DP0 pid=2964220) INFO 12-04 06:28:57 [default_loader.py:267] Loading weights took 3.21 seconds
(EngineCore_DP0 pid=2964220) INFO 12-04 06:28:57 [gpu_model_runner.py:2653] Model loading took 1.6714 GiB and 5.854078 seconds
Done in 22f0f84
INFO 12-04 17:45:34 [parallel_state.py:1208] rank 0 in world size 1 is assigned as DP rank 0, PP rank 0, TP rank 0, EP rank 0
INFO:vllm_omni.diffusion.worker.gpu_worker:Worker 0: Initialized device and distributed environment.
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:04<00:00, 1.58s/it]
WARNING 12-04 17:45:39 [__init__.py:755] Current vLLM config is not set.
Loading safetensors checkpoint shards: 0% Completed | 0/3 [00:00<?, ?it/s]
Loading safetensors checkpoint shards: 33% Completed | 1/3 [00:03<00:06, 3.48s/it]
Loading safetensors checkpoint shards: 67% Completed | 2/3 [00:06<00:03, 3.34s/it]
Loading safetensors checkpoint shards: 100% Completed | 3/3 [00:08<00:00, 2.50s/it]
Loading safetensors checkpoint shards: 100% Completed | 3/3 [00:08<00:00, 2.74s/it]
INFO:vllm_omni.diffusion.model_loader.diffusers_loader:Loading weights took 8.44 seconds
INFO:vllm_omni.diffusion.worker.gpu_worker:Model loading took 19.2180 GiB and 14.825355 seconds
INFO:vllm_omni.diffusion.worker.gpu_worker:Worker 0: Model loaded successfully.
INFO:vllm_omni.diffusion.worker.gpu_worker:Worker 0: Scheduler loop started.
INFO:vllm_omni.diffusion.worker.gpu_worker:Worker 0 ready to receive requests via shared memory
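For readers who want to reproduce a "Model loading took ... GiB and ... seconds" line in their own worker, the following is a minimal sketch of one way to measure it. It is not the code from 22f0f84; `track_model_loading`, `LoadStats` and the `get_memory_bytes` callback are hypothetical names. On a CUDA device the callback could be something like `torch.cuda.memory_allocated`, but the sketch uses a fake counter so it runs without a GPU.

```python
import logging
import time
from contextlib import contextmanager
from typing import Callable, Iterator

logger = logging.getLogger("model_loading_sketch")

GiB = 1 << 30


class LoadStats:
    """Filled in once the wrapped block has finished without raising."""

    seconds: float = 0.0
    gib: float = 0.0


@contextmanager
def track_model_loading(get_memory_bytes: Callable[[], int]) -> Iterator[LoadStats]:
    """Measure wall-clock time and memory growth across the wrapped block."""
    stats = LoadStats()
    mem_before = get_memory_bytes()
    start = time.perf_counter()
    yield stats
    # Only reached when the body succeeds; a failed load is not reported.
    stats.seconds = time.perf_counter() - start
    stats.gib = (get_memory_bytes() - mem_before) / GiB
    logger.info(
        "Model loading took %.4f GiB and %.6f seconds", stats.gib, stats.seconds
    )


# Fake memory counter standing in for a real device query.
fake_memory = {"bytes": 0}
with track_model_loading(lambda: fake_memory["bytes"]) as stats:
    fake_memory["bytes"] += 2 * GiB  # pretend the weights were allocated here
```

Measuring the delta rather than the absolute value keeps the figure meaningful when something else already occupies device memory before loading starts.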
Purpose
- DiffusersPipelineLoader to load weights under the subfolder of a standard diffusers model repo.
- AutoWeightsLoader from the main repo to load weights recursively.
Test Plan
Test Result
Both the Z-Image and Qwen-Image pipelines still work.
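The Purpose section mentions loading weights recursively. As an illustration of that general idea only, here is a toy sketch that routes dotted weight names such as "transformer.proj.weight" down a module tree. It is not vLLM's AutoWeightsLoader and does not use its API; `ToyModule` and `load_recursively` are hypothetical names, and plain floats stand in for tensors.

```python
from typing import Dict, Iterable, List, Set, Tuple


class ToyModule:
    """Hypothetical stand-in for a module tree; values stand in for tensors."""

    def __init__(self, param_names: Iterable[str] = (), **children: "ToyModule"):
        self.params: Dict[str, object] = {name: None for name in param_names}
        self.children: Dict[str, "ToyModule"] = dict(children)


def load_recursively(
    module: ToyModule, weights: Iterable[Tuple[str, object]], prefix: str = ""
) -> Set[str]:
    """Assign weights to this module, delegating dotted names to child modules.

    Returns the fully qualified names of all parameters that were loaded.
    """
    loaded: Set[str] = set()
    by_child: Dict[str, List[Tuple[str, object]]] = {}
    for name, value in weights:
        head, _, rest = name.partition(".")
        if rest and head in module.children:
            # Strip the leading component and defer to the matching child.
            by_child.setdefault(head, []).append((rest, value))
        elif name in module.params:
            module.params[name] = value
            loaded.add(prefix + name)
        else:
            raise KeyError(f"unexpected weight: {prefix + name}")
    for head, child_weights in by_child.items():
        loaded |= load_recursively(
            module.children[head], child_weights, prefix + head + "."
        )
    return loaded


pipeline = ToyModule(
    transformer=ToyModule(["scale"], proj=ToyModule(["weight", "bias"]))
)
checkpoint = [
    ("transformer.scale", 1.0),
    ("transformer.proj.weight", 2.0),
    ("transformer.proj.bias", 3.0),
]
loaded = load_recursively(pipeline, checkpoint)
```

Returning the set of loaded names lets a caller compare it against the expected parameters and detect weights that are missing from the checkpoint.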