Contributor

@tjkemp tjkemp commented Jul 31, 2025

Summary

In hunyuan_video_usp_example.py, batch inference applied the first item's mask (encoder_attention_mask[0]) to every item in the batch. This works for bs=1, but with multiple prompts it cuts off the masks of longer prompts and produces artifacts. This PR fixes that.
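
To illustrate the failure mode, here is a toy sketch in plain Python lists. It is not the actual patch: the real example operates on torch tensors, and the mask values and the `kept_tokens` helper below are made up for illustration only.

```python
# Toy illustration; the real code works on torch tensors, not lists.
# 1 marks a real prompt token, 0 marks padding.
encoder_attention_mask = [
    [1, 1, 1, 0, 0, 0, 0, 0],  # short prompt: 3 tokens
    [1, 1, 1, 1, 1, 1, 0, 0],  # longer prompt: 6 tokens
]


def kept_tokens(mask_row):
    """Hypothetical helper: number of prompt tokens attention may see."""
    return sum(mask_row)


# Buggy behaviour: the first item's mask is reused for every batch item,
# so the longer prompt loses its last three tokens.
shared = [kept_tokens(encoder_attention_mask[0]) for _ in encoder_attention_mask]

# Fixed behaviour: each batch item uses its own mask row.
per_item = [kept_tokens(row) for row in encoder_attention_mask]

print(shared)    # [3, 3]
print(per_item)  # [3, 6]
```

With a single prompt both variants agree, which is why the problem only shows up when a second, longer prompt is added.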

How to reproduce

  1. Apply patch to make script save all videos
diff --git a/examples/hunyuan_video_usp_example.py b/examples/hunyuan_video_usp_example.py
index 03856f1..a9041ac 100644
--- a/examples/hunyuan_video_usp_example.py
+++ b/examples/hunyuan_video_usp_example.py
@@ -297,7 +291,7 @@ def main():
         guidance_scale=input_config.guidance_scale,
         generator=torch.Generator(device="cuda").manual_seed(
             input_config.seed),
-    ).frames[0]
+    )
 
     end_time = time.time()
     elapsed_time = end_time - start_time
@@ -311,9 +305,10 @@ def main():
     )
     if is_dp_last_group():
         resolution = f"{input_config.width}x{input_config.height}"
-        output_filename = f"results/hunyuan_video_{parallel_info}_{resolution}.mp4"
-        export_to_video(output, output_filename, fps=15)
-        print(f"output saved to {output_filename}")
+        for idx, frames in enumerate(output.frames, start=1):
+            output_filename = f"results/hunyuan_video_{idx:02d}_{parallel_info}_{resolution}.mp4"
+            export_to_video(frames, output_filename, fps=15)
+            print(f"output saved to {output_filename}")
 
     if get_world_group().rank == get_world_group().world_size - 1:
         print(
  2. Run the example with a second prompt added.
mkdir -p results && torchrun --nproc_per_node=2 examples/hunyuan_video_usp_example.py --model tencent/HunyuanVideo --ulysses_degree 2 --num_inference_steps 30 --warmup_steps 0 --prompt "A husky puppy plays with its own tail." "Two Siamese cats eat sushi from a plate." --height 320 --width 512 --num_frames 61 --enable_tiling --enable_model_cpu_offload
  3. Compare the results.

The second video before this fix:
hunyuan_video_02

The second video after applying this fix:
hunyuan_video_02

Notes

Unrelated to this change, I’m seeing:

[rank1]:   File "/app/xDiT/examples/hunyuan_video_usp_example.py", line 297, in main
[rank1]:     guidance_scale=input_config.guidance_scale,
[rank1]:                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: AttributeError: 'InputConfig' object has no attribute 'guidance_scale'

@tjkemp tjkemp changed the title from "Fix mask handling for batch generation" to "Fix mask handling for batch generation in HunyuanVideo example" Jul 31, 2025
@feifeibear
Collaborator

The guidance_scale error probably comes from a compatibility issue with the diffusers version.

@tjkemp
Contributor Author

tjkemp commented Sep 2, 2025

Is there anything else I could adjust to help get this PR ready for merge?

Collaborator

@jcaraban jcaraban left a comment

These changes work after #582 was merged

@jcaraban jcaraban merged commit e832379 into xdit-project:main Oct 28, 2025
@jcaraban jcaraban mentioned this pull request Nov 12, 2025