[Feature] Support cache-dit for Wan 2.2 inference#1021

Merged
david6666666 merged 5 commits into vllm-project:main from SamitHuang:wan_cache
Jan 28, 2026

@SamitHuang SamitHuang commented Jan 28, 2026

Purpose

Accelerate generation for the Wan2.2 model series with cache-dit.

  • Fix a bug in which the dummy run interfered with the diffusion cache refresh.
  • Fix and tune the cache-dit config to improve performance of the Wan2.2 pipelines.
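
The PR does not spell out how the dummy run affected the cache refresh. As a purely illustrative sketch of that general failure mode (not the vllm-omni or cache-dit code), the toy class below keeps per-request state that a dummy run can leave behind. `ToyStepCache`, `refresh`, `record`, and `run` are hypothetical names invented for this example.

```python
class ToyStepCache:
    """Hypothetical per-pipeline step cache; NOT the cache-dit API."""

    def __init__(self):
        self.step = 0          # index of the next denoising step
        self.residuals = {}    # step index -> recorded residual

    def refresh(self):
        """Reset all per-request state before a new generation."""
        self.step = 0
        self.residuals.clear()

    def record(self, residual):
        """Store the residual for the current step and advance the counter."""
        self.residuals[self.step] = residual
        self.step += 1


def run(cache, num_steps, refresh):
    """Simulate one generation; optionally refresh the cache first."""
    if refresh:
        cache.refresh()
    for _ in range(num_steps):
        cache.record(0.1)
    return cache.step


# Case 1: a 1-step dummy run, then a real 40-step request WITHOUT a refresh.
leaky = ToyStepCache()
run(leaky, num_steps=1, refresh=True)
stale_steps = run(leaky, num_steps=40, refresh=False)

# Case 2: the same sequence, but the real request refreshes the cache first.
clean = ToyStepCache()
run(clean, num_steps=1, refresh=True)
clean_steps = run(clean, num_steps=40, refresh=True)

print(stale_steps, clean_steps)
```

In the first case the step counter continues from the dummy run, so any decision keyed on the step index would be off by one; in the second case the real request starts from step 0.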

Test Plan

python text_to_video.py \
  --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage." \
  --negative_prompt "色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走" \
  --height 480 \
  --width 832 \
  --num_frames 33 \
  --guidance_scale 4.0 \
  --guidance_scale_high 4.0 \
  --boundary_ratio  0.875 \
  --num_inference_steps 40 \
  --fps 16 \
  --cache_backend cache_dit \
  --enable-cache-dit-summary \
  --output t2v_out.mp4

Test Result

'CFG Cache Steps: 15, [2, 3, 4, 6, 7, 8, 10, 11, 12, 14, 15, 16, 18, 19, 21]'
("CFG Residual Diffs: 17, {'2': 0.03515625, '3': 0.06494140625, '4': "
 "0.0966796875, '6': 0.0390625, '7': 0.0732421875, '8': 0.11083984375, '10': "
 "0.047607421875, '11': 0.091796875, '12': 0.140625, '14': 0.060546875, '15': "
 "0.11962890625, '16': 0.185546875, '18': 0.08984375, '19': 0.185546875, '20': "
 "0.306640625, '21': 0.1796875, '22': 0.44140625}")
[Stage-0] INFO 01-28 02:21:52 [diffusion_engine.py:74] Generation completed successfully.
[Stage-0] INFO 01-28 02:21:52 [diffusion_engine.py:97] Post-processing completed in 0.1659 seconds
INFO 01-28 02:21:53 [log_utils.py:550] {'type': 'request_level_metrics',
INFO 01-28 02:21:53 [log_utils.py:550]  'request_id': '0_fe657b0b-4321-40d4-893c-1c6d71568240',
INFO 01-28 02:21:53 [log_utils.py:550]  'e2e_time_ms': 33146.83222770691,
INFO 01-28 02:21:53 [log_utils.py:550]  'e2e_tpt': 0.0,
INFO 01-28 02:21:53 [log_utils.py:550]  'e2e_total_tokens': 0,
INFO 01-28 02:21:53 [log_utils.py:550]  'transfers_total_time_ms': 0.0,
INFO 01-28 02:21:53 [log_utils.py:550]  'transfers_total_bytes': 0,
INFO 01-28 02:21:53 [log_utils.py:550]  'stages': {0: {'stage_gen_time_ms': 32580.10959625244,
INFO 01-28 02:21:53 [log_utils.py:550]                 'num_tokens_out': 0,
INFO 01-28 02:21:53 [log_utils.py:550]                 'num_tokens_in': 0}}}
Processed prompts: 100%|██████████████████████████████████████████████████████████████████████████████| 1/1 [00:33<00:00, 33.15s/img, est. speed stage-0 img/s: 0.00, avg e2e_lat: 0.0ms]
INFO 01-28 02:21:53 [omni.py:833] [Summary] {'e2e_requests': 1,
INFO 01-28 02:21:53 [omni.py:833]  'e2e_total_time_ms': 33148.048400878906,
INFO 01-28 02:21:53 [omni.py:833]  'e2e_sum_time_ms': 33146.83222770691,
t2v_out.cachedit.mp4
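
The cache summary at the top of the log can be cross-checked with a few lines of Python. The sketch below copies the two structures verbatim from the output above and lists the steps that have a recorded residual diff but do not appear in the cached-step list. How cache-dit decides which steps to cache is not stated in this PR, so the reading in the note after the code is an observation on this one run, not a description of the library's logic.

```python
# Values copied from the "CFG Cache Steps" and "CFG Residual Diffs" log lines.
cached_steps = [2, 3, 4, 6, 7, 8, 10, 11, 12, 14, 15, 16, 18, 19, 21]
residual_diffs = {
    '2': 0.03515625, '3': 0.06494140625, '4': 0.0966796875,
    '6': 0.0390625, '7': 0.0732421875, '8': 0.11083984375,
    '10': 0.047607421875, '11': 0.091796875, '12': 0.140625,
    '14': 0.060546875, '15': 0.11962890625, '16': 0.185546875,
    '18': 0.08984375, '19': 0.185546875, '20': 0.306640625,
    '21': 0.1796875, '22': 0.44140625,
}

# The log uses string keys; convert them to integer step indices.
measured = {int(step): diff for step, diff in residual_diffs.items()}

# Steps with a measured residual diff that were not served from the cache.
not_cached = sorted(set(measured) - set(cached_steps))

max_cached_diff = max(measured[s] for s in cached_steps)
min_not_cached_diff = min(measured[s] for s in not_cached)

print("cached:", len(cached_steps), "measured:", len(measured))
print("measured but not cached:", not_cached)
print("largest diff among cached steps:", max_cached_diff)
print("smallest diff among non-cached steps:", min_not_cached_diff)
```

In this run the two measured-but-not-cached steps (20 and 22) carry the largest residual diffs, while every cached step stays at or below 0.185546875. That pattern is consistent with a threshold-style decision, though the threshold itself is not shown in the log.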

E2E time is reduced from 81.55 s to 33.15 s, a speedup of about 2.46x.
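
The speedup can be recomputed from the logged `e2e_time_ms`. In the sketch below, the 81.55 s baseline is taken from the sentence above (its log is not included in this PR) and the cached time is the value reported by `log_utils.py`. Using the unrounded cached time gives a ratio marginally lower than the one obtained with 33 s.

```python
# Baseline without cache-dit, as stated in the PR description (seconds).
baseline_s = 81.55

# 'e2e_time_ms' from the request-level metrics in the log above.
cached_ms = 33146.83222770691
cached_s = cached_ms / 1000.0

speedup = baseline_s / cached_s
saved_s = baseline_s - cached_s
reduction_pct = 100.0 * saved_s / baseline_s

print(f"speedup: {speedup:.2f}x")
print(f"time saved: {saved_s:.2f} s ({reduction_pct:.1f}% lower E2E time)")
```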


Signed-off-by: samithuang <[email protected]>
Signed-off-by: samithuang <[email protected]>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b19b0805c1

Signed-off-by: samithuang <[email protected]>
@ZJY0516 ZJY0516 added the ready label to trigger buildkite CI label Jan 28, 2026
@ZJY0516 ZJY0516 added this to the v0.14.0 milestone Jan 28, 2026
@david6666666

LGTM

@david6666666 david6666666 merged commit c0beebb into vllm-project:main Jan 28, 2026
6 of 7 checks passed
dongbo910220 pushed a commit to dongbo910220/vllm-omni that referenced this pull request Feb 1, 2026