@avjves
Contributor

@avjves avjves commented Nov 18, 2025

What?

Adds support for Wan 2.X T2V/TI2V models.

Why?

In PR #583, we added support for Wan 2.X I2V models. This PR adds support for the other two tasks.

How?

Folds wan_i2v_example.py into a single wan_example.py file that supports all three tasks (I2V, T2V, TI2V).
The desired task must be specified by a required --task parameter.
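
As a rough illustration only (this is not the code from the PR), a required --task flag could be declared with argparse as sketched below. The choice list mirrors the three tasks exercised in the tests; the function name `build_parser` and the help strings are hypothetical, and the real script may use a different parser entirely.

```python
import argparse

# Tasks named in this PR; anything else is rejected by argparse.
TASKS = ("i2v", "t2v", "ti2v")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser sketch; the actual wan_example.py may differ."""
    parser = argparse.ArgumentParser(description="Wan 2.X example (sketch)")
    # required=True makes argparse exit with an error when --task is omitted.
    parser.add_argument(
        "--task",
        required=True,
        choices=TASKS,
        help="Which Wan task to run.",
    )
    # Optional at parse time; whether it is needed depends on the task.
    parser.add_argument(
        "--img_file_path",
        default=None,
        help="Path or URL of the input image.",
    )
    return parser


# Passing an explicit argument list keeps the example independent of sys.argv.
args = build_parser().parse_args(["--task", "t2v"])
print(args.task)
```

Omitting --task makes argparse print a usage error and exit, which is the behavior a required flag implies.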

Tests

Wan 2.2 I2V:

i2v_output.mp4

Run command:

torchrun --nproc_per_node=8 examples/wan_example.py \
    --height 720 --width 1280 --num_frames 81 \
    --model Wan-AI/Wan2.2-I2V-A14B-Diffusers \
    --ulysses_degree 8 \
    --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside." \
    --task i2v \
    --img_file_path https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/wan_i2v_input.JPG

Wan 2.2 T2V:

t2v_output.mp4

Run command:

torchrun --nproc_per_node=8 examples/wan_example.py \
    --height 720 --width 1280 --num_frames 81 \
    --model Wan-AI/Wan2.2-T2V-A14B-Diffusers \
    --ulysses_degree 8 \
    --prompt "Intense action sequence: A resolute woman bolts down a stormy city street, her face etched with determination. Camera tracks her lightning-fast rocketing sprint, dodging amidst blasts and blinding lights. Ultra-clear, immersive motion trails, parallax effect in high-definition. Close-up, dynamic perspective." \
    --task t2v

Wan 2.2 TI2V:

ti2v_output.mp4

Run command:

torchrun --nproc_per_node=8 examples/wan_example.py \
    --height 720 --width 1280 --num_frames 121 \
    --model Wan-AI/Wan2.2-TI2V-5B-Diffusers \
    --ulysses_degree 8 \
    --prompt "Intense action sequence: A resolute woman bolts down a stormy city street, her face etched with determination. Camera tracks her lightning-fast rocketing sprint, dodging amidst blasts and blinding lights. Ultra-clear, immersive motion trails, parallax effect in high-definition. Close-up, dynamic perspective." \
    --task ti2v

Wan 2.1 T2V:

t2v_output.mp4

Run command:

torchrun --nproc_per_node=4 examples/wan_example.py \
    --height 720 --width 1280 --num_frames 81 \
    --model Wan-AI/Wan2.1-T2V-14B-Diffusers \
    --ulysses_degree 4 \
    --prompt "Intense action sequence: A resolute woman bolts down a stormy city street, her face etched with determination. Camera tracks her lightning-fast rocketing sprint, dodging amidst blasts and blinding lights. Ultra-clear, immersive motion trails, parallax effect in high-definition. Close-up, dynamic perspective." \
    --task t2v \
    --img_file_path /app/Wan2.1/examples/i2v_input.JPG \
    --num_inference_steps 40

@gemini-code-assist

Summary of Changes

Hello @avjves, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the capabilities for Wan 2.X models by integrating support for T2V and TI2V functionalities, building upon the existing I2V support. It also refactors the example usage into a unified script, improving user experience and simplifying model interaction across different tasks.

Highlights

  • Extended Wan 2.X Model Support: Added support for Wan 2.X Text-to-Video (T2V) and Text-Image-to-Video (TI2V) models, building upon the existing Image-to-Video (I2V) capabilities.
  • Unified Example Script: Consolidated the example scripts for Wan models into a single wan_example.py file, which now supports all three tasks (I2V, T2V, TI2V) through a new, required --task command-line argument.
  • README Documentation Update: Updated the README.md to accurately reflect the newly supported Wan 2.1 and Wan 2.2 models, including their specific entries in the model compatibility and diffusers version requirement tables.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds support for Wan 2.X T2V/TI2V models by generalizing the example script. The changes are mostly well-structured, but I've identified two critical issues in examples/wan_example.py related to image handling. The validation for the input image path is overly strict, and the logic for loading and processing the image is incorrect for certain tasks. I've provided detailed feedback and code suggestions to address these problems.
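
To make the image-handling concern concrete, here is a minimal sketch of task-aware validation. It rests on an assumption inferred from the run commands in this PR rather than from the actual diff: i2v needs an input image, ti2v accepts one optionally, and t2v does not use one. The function name `validate_image_arg` is hypothetical and does not come from examples/wan_example.py.

```python
from typing import Optional


def validate_image_arg(task: str, img_file_path: Optional[str]) -> Optional[str]:
    """Return the image path to load, or None if the task runs without one.

    Hypothetical helper: the per-task rules below are assumptions, not the
    behavior of the script in this PR.
    """
    if task == "i2v":
        # Image-to-video cannot run without a conditioning image.
        if not img_file_path:
            raise ValueError("--img_file_path is required for --task i2v")
        return img_file_path
    if task == "ti2v":
        # Assumed optional: use the image when given, otherwise text only.
        return img_file_path or None
    if task == "t2v":
        # Assumed unused: any supplied image path is ignored.
        return None
    raise ValueError(f"unknown task: {task!r}")
```

Under these assumptions, only i2v rejects a missing image, so a stray --img_file_path passed alongside --task t2v would be ignored rather than treated as an error.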

Collaborator

@feifeibear feifeibear left a comment

LGTM

@avjves
Contributor Author

avjves commented Nov 18, 2025

Tested on NV as well:
https://github.com/user-attachments/assets/e2fcd685-f2db-461a-9d16-08b7431398ae

Run command:

torchrun --nproc_per_node=8 examples/wan_example.py \
    --height 540 --width 960 --num_frames 121 \
    --model Wan-AI/Wan2.2-T2V-A14B-Diffusers \
    --ulysses_degree 8 \
    --prompt "Intense action sequence: A resolute woman bolts down a stormy city street, her face etched with determination. Camera tracks her lightning-fast rocketing sprint, dodging amidst blasts and blinding lights. Ultra-clear, immersive motion trails, parallax effect in high-definition. Close-up, dynamic perspective." \
    --task t2v

@feifeibear feifeibear merged commit 78cb759 into xdit-project:main Nov 19, 2025