@ksikiric
Contributor

@ksikiric ksikiric commented Nov 18, 2025

Recent AITER commits expose multiple rounding modes. Changing the mode from 1 (rtna = Round to Nearest Away from Zero) to 2 (rtz = Round Toward Zero) gives a 6% end-to-end (E2E) uplift for Wan2.2 on MI300X, with no visible changes to the outputs.
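For readers unfamiliar with the two modes, here is a minimal pure-Python sketch of how they differ when a float32 value is narrowed to bf16. It only illustrates the numerics for finite inputs; the helper names are hypothetical, and it makes no claim about how AITER's kernels implement the conversion.

```python
import struct


def f32_bits(x: float) -> int:
    """Return the IEEE-754 float32 bit pattern of x as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(bits: int) -> float:
    """Interpret the low 32 bits of an int as a float32 value."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


def bf16_rtz(x: float) -> float:
    """Round toward zero: keep the top 16 bits, drop the rest."""
    return bits_to_f32(f32_bits(x) & 0xFFFF0000)


def bf16_rtna(x: float) -> float:
    """Round to nearest, ties away from zero (finite inputs only).

    Adding half a bf16 ULP to the sign-magnitude bit pattern increases
    the magnitude, so an exact tie moves away from zero before truncation.
    """
    return bits_to_f32((f32_bits(x) + 0x8000) & 0xFFFF0000)


# 1 + 2**-8 lies exactly halfway between two neighbouring bf16 values.
tie = 1.00390625
print(bf16_rtz(tie))    # 1.0
print(bf16_rtna(tie))   # 1.0078125
```

The two modes agree whenever the value is already representable in bf16 and differ by at most one bf16 ULP otherwise.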

Tested with command:
torchrun --nproc_per_node=8 examples/wan_i2v_example.py --height 720 --width 1280 --num_frames 81 --model Wan-AI/Wan2.2-I2V-A14B-Diffusers --ulysses_degree 8 --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside." --num_inference_steps 40 --use_torch_compile --seed 42 --img_file_path https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/wan_i2v_input.JPG

Output from current main:

i2v_output_org.mp4

Output with this PR:

i2v_output_rm2.mp4

@gemini-code-assist

Summary of Changes

Hello @ksikiric, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request integrates a mechanism to control the rounding mode within AITER's flash attention function, specifically targeting the how_v3_bf16_cvt parameter. By exposing this setting, the change enables performance optimizations, as demonstrated by a 6% E2E uplift for the Wan2.2 model when using a specific round mode. The implementation dynamically checks for the parameter's availability and allows configuration via an environment variable.

Highlights

  • AITER Round Mode Control: Introduces dynamic control over AITER's rounding modes for flash_attn_func by checking for the how_v3_bf16_cvt parameter.
  • Performance Uplift: Achieved a 6% end-to-end performance improvement for the Wan2.2 model by changing the AITER round mode from 1 to 2.
  • Environment Variable Configuration: Allows configuring the AITER round mode via the HOW_V3_BF16_CVT environment variable, defaulting to 2 if the parameter is detected.
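The detection and configuration described in the highlights above could look roughly like the following. This is a sketch, not the PR's actual code: `supports_round_mode`, `round_mode_from_env`, and the two stand-in attention functions are hypothetical names, and only the `how_v3_bf16_cvt` parameter and the `HOW_V3_BF16_CVT` variable come from the PR summary.

```python
import inspect
import os


def supports_round_mode(fn) -> bool:
    """Return True if fn accepts a how_v3_bf16_cvt keyword argument."""
    try:
        return "how_v3_bf16_cvt" in inspect.signature(fn).parameters
    except (TypeError, ValueError):
        # Not callable, or a builtin without an introspectable signature.
        return False


def round_mode_from_env(default: int = 2) -> int:
    """Read the round mode from HOW_V3_BF16_CVT, falling back to default."""
    return int(os.environ.get("HOW_V3_BF16_CVT", default))


# Hypothetical stand-ins for an older and a newer attention entry point.
def old_flash_attn(q, k, v, causal=False):
    return "old"


def new_flash_attn(q, k, v, causal=False, how_v3_bf16_cvt=1):
    return f"new(mode={how_v3_bf16_cvt})"


if supports_round_mode(new_flash_attn):
    print(new_flash_attn(None, None, None, how_v3_bf16_cvt=round_mode_from_env()))
```

Checking the signature rather than a version string means the code keeps working if the parameter is backported or removed, at the cost of relying on the parameter name staying stable.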
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist

Summary of Changes

This pull request enhances the xfuser library by adding dynamic control over AITER's rounding modes within flash attention operations. This allows for fine-tuning of numerical precision, which has been shown to yield significant performance gains, such as a 6% end-to-end uplift for the Wan2.2 model, by simply adjusting the rounding mode.

Highlights

  • AITER Round Mode Control: Introduces the ability to control the rounding mode for AITER's flash attention function, specifically targeting the how_v3_bf16_cvt parameter.
  • Performance Uplift: Demonstrates a 6% end-to-end performance improvement for the Wan2.2 model by changing the AITER round mode from 1 to 2.
  • Configurable via Environment Variable: The how_v3_bf16_cvt parameter is now configurable via the HOW_V3_BF16_CVT environment variable, with a default value of 2 if not specified.

@gemini-code-assist

Summary of Changes

This pull request enhances the xfuser library by integrating dynamic control over AITER's flash attention rounding modes. By detecting the availability of a specific rounding parameter and allowing its configuration via an environment variable, the change enables performance optimizations, as demonstrated by a 6% E2E uplift for the Wan2.2 model.

Highlights

  • AITER Round Mode Control: Introduces the ability to control the rounding mode for AITER's flash attention function, specifically targeting the how_v3_bf16_cvt parameter.
  • Performance Uplift: Changing the round mode from 1 to 2 resulted in a significant 6% end-to-end performance uplift for the Wan2.2 model.
  • Configurable via Environment Variable: The specific round mode can now be configured via the HOW_V3_BF16_CVT environment variable, with a default value of 2 if not explicitly set.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces support for AITER's round mode control, configurable via an environment variable. The changes correctly detect this feature and pass the new parameter to the attention function. My review includes suggestions to make the feature detection more robust against different AITER library versions to prevent potential crashes, and to refactor a section of duplicated code to improve maintainability.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces support for AITER's round mode control to improve performance. The changes involve detecting if the installed aiter version supports this feature and then using it when available.

My review focuses on improving the robustness of the feature detection and reducing code duplication.

  • The feature detection for the round mode could crash if an older version of aiter is used. I've suggested using a try-except block to handle this gracefully.
  • The new logic for calling the attention function has duplicated code, which I've suggested refactoring for better maintainability.
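Both suggestions can be sketched together as follows. This is an illustration under assumptions, not the code that was merged: `detect_round_mode` and `call_attention` are hypothetical helpers, and the `aiter` / `flash_attn_func` defaults are taken from the review text rather than verified against a specific AITER release.

```python
import importlib
import inspect


def detect_round_mode(module_name: str = "aiter",
                      func_name: str = "flash_attn_func") -> bool:
    """Return True only if the function exists and accepts how_v3_bf16_cvt.

    A missing module, a missing function, or a non-introspectable
    signature yields False instead of raising.
    """
    try:
        module = importlib.import_module(module_name)
        fn = getattr(module, func_name)
        return "how_v3_bf16_cvt" in inspect.signature(fn).parameters
    except (ImportError, AttributeError, TypeError, ValueError):
        return False


def call_attention(attn_fn, q, k, v, *, round_mode=None, **common):
    """Call attn_fn from a single call site.

    The optional keyword is added to one kwargs dict, so the shared
    arguments are not repeated in two near-identical branches.
    """
    kwargs = dict(common)
    if round_mode is not None:
        kwargs["how_v3_bf16_cvt"] = round_mode
    return attn_fn(q, k, v, **kwargs)
```

A caller would evaluate `detect_round_mode()` once at import time and pass `round_mode=None` when it returns False, so older library versions see exactly the arguments they saw before.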

Kristian Sikiric added 3 commits November 18, 2025 12:18
…removed code duplication when checking if round mode is available when calling aiter flash attention
…s this was a mistake and should not have been changed in the first place.
@ksikiric ksikiric force-pushed the aiter_round_mode_control branch from 715e4cf to 78d1daa Compare November 18, 2025 15:04
@jcaraban jcaraban self-requested a review November 18, 2025 15:08
Collaborator

@jcaraban jcaraban left a comment

LGTM

@feifeibear feifeibear merged commit ccba9d5 into xdit-project:main Nov 19, 2025