
Conversation


@e1ijah1 e1ijah1 commented Dec 4, 2025

Summary

  • Running the LightX2VPipeline t2v demo (hunyuan_video_1.5) fails whenever the prompt contains quotation marks (e.g., A "shiny" robot walking); quoted spans appear to be what triggers the ByT5 tokenization path, which is why unquoted prompts run fine.
  • Root cause: _process_single_byt5_prompt calls get_byt5_text_tokens(), but ByT5TextEncoder does not define this method, so the call raises AttributeError.
  • This PR adds the missing get_byt5_text_tokens() method; a sketch of its expected behavior follows below.
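
For reference, here is a minimal sketch of what the missing method plausibly needs to do, matching the call site self.get_byt5_text_tokens(self.byt5_tokenizer, self.byt5_max_length, formatted_text) in the traceback below. The padding and truncation settings are my assumptions, not necessarily the committed implementation:

def get_byt5_text_tokens(self, tokenizer, max_length, text):
    # Tokenize the quoted span with the byte-level ByT5 tokenizer,
    # padding/truncating to the encoder's fixed length, and return
    # the token ids plus the attention mask used downstream.
    outputs = tokenizer(
        text,
        padding="max_length",
        truncation=True,
        max_length=max_length,
        return_tensors="pt",
    )
    return outputs.input_ids, outputs.attention_mask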

Demo code:

from lightx2v import LightX2VPipeline

# Initialize pipeline
pipe = LightX2VPipeline(
    model_path="/path/to/hunyuanvideo-1.5/",  # Original model path
    model_cls="hunyuan_video_1.5",
    transformer_model_name="480p_t2v",
    task="t2v",
    # 4-step distilled model ckpt
    dit_original_ckpt="/path/to/hy1.5_t2v_480p_lightx2v_4step.safetensors"
)

# Enable FP8 quantization for the distilled model
pipe.enable_quantize(
    quant_scheme='fp8-sgl',
    dit_quantized=True,
    dit_quantized_ckpt="/path/to/hy1.5_t2v_480p_scaled_fp8_e4m3_lightx2v_4step.safetensors",
    text_encoder_quantized=False,  # Optional: can also quantize text encoder
    text_encoder_quantized_ckpt="/path/to/hy15_qwen25vl_llm_encoder_fp8_e4m3_lightx2v.safetensors",  # Optional
    image_encoder_quantized=False,
)

# Enable offloading for lower VRAM usage
pipe.enable_offload(
    cpu_offload=True,
    offload_granularity="block",
    text_encoder_offload=True,
    image_encoder_offload=False,
    vae_offload=False,
)

# Create generator
pipe.create_generator(
    attn_mode="sage_attn2",
    infer_steps=4,
    num_frames=81,
    guidance_scale=1,
    sample_shift=9.0,
    aspect_ratio="16:9",
    fps=16,
    denoising_step_list=[1000, 750, 500, 250]
)

# Generate video
pipe.generate(
    seed=123,
    prompt='A "shiny" robot walking',
    negative_prompt="",
    save_result_path="/path/to/output.mp4",
)

Error

Traceback (most recent call last):
  File "/weights/lab/LightX2V/run.py", line 69, in <module>
    pipe.generate(
  File "/opt/conda/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
  File "/weights/lab/LightX2V/lightx2v/pipeline.py", line 346, in generate
    self.runner.run_pipeline(input_info)
  File "/weights/lab/LightX2V/lightx2v/utils/profiler.py", line 80, in sync_wrapper
    return func(*args, **kwargs)
  File "/weights/lab/LightX2V/lightx2v/models/runners/default_runner.py", line 381, in run_pipeline
    self.inputs = self.run_input_encoder()
  File "/weights/lab/LightX2V/lightx2v/utils/profiler.py", line 80, in sync_wrapper
    return func(*args, **kwargs)
  File "/weights/lab/LightX2V/lightx2v/models/runners/hunyuan_video/hunyuan_video_15_runner.py", line 436, in _run_input_encoder_local_t2v
    text_encoder_output = self.run_text_encoder(self.input_info)
  File "/weights/lab/LightX2V/lightx2v/models/runners/hunyuan_video/hunyuan_video_15_runner.py", line 290, in run_text_encoder
    byt5_features, byt5_masks = self.text_encoders[1].infer([prompt])
  File "/opt/conda/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
  File "/weights/lab/LightX2V/lightx2v/models/input_encoders/hf/hunyuan15/byt5/model.py", line 333, in infer
    byt5_embeddings, byt5_masks = self._prepare_byt5_embeddings(prompts)
  File "/weights/lab/LightX2V/lightx2v/models/input_encoders/hf/hunyuan15/byt5/model.py", line 304, in _prepare_byt5_embeddings
    pos_emb, pos_mask = self._process_single_byt5_prompt(prompt, AI_DEVICE)
  File "/weights/lab/LightX2V/lightx2v/models/input_encoders/hf/hunyuan15/byt5/model.py", line 280, in _process_single_byt5_prompt
    text_ids, text_mask = self.get_byt5_text_tokens(self.byt5_tokenizer, self.byt5_max_length, formatted_text)
AttributeError: 'ByT5TextEncoder' object has no attribute 'get_byt5_text_tokens'
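
As a quick sanity check outside the full pipeline, the tokenization step itself can be exercised directly with the Hugging Face ByT5 tokenizer (the model name and max length here are illustrative, not the pipeline's actual configuration):

from transformers import AutoTokenizer

# ByT5 tokenizes raw UTF-8 bytes, so quotation marks are ordinary
# tokens and need no special escaping.
tokenizer = AutoTokenizer.from_pretrained("google/byt5-small")
out = tokenizer(
    'A "shiny" robot walking',
    padding="max_length",
    truncation=True,
    max_length=256,
    return_tensors="pt",
)
print(out.input_ids.shape, out.attention_mask.shape)  # (1, 256) each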

@gemini-code-assist
Contributor

Summary of Changes

Hello @e1ijah1, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical bug in the LightX2VPipeline's text encoding mechanism. By introducing a missing tokenization utility, it ensures that text prompts, especially those with special characters like quotation marks, are correctly processed by the ByT5 text encoder. This enhancement improves the stability and reliability of the text-to-video generation pipeline, preventing runtime errors and allowing for a wider range of prompt inputs.

Highlights

  • Missing Method Added: Implemented the get_byt5_text_tokens() static method within the ByT5TextEncoder class to handle text tokenization for the ByT5 model.
  • Bug Fix: Resolved an AttributeError that occurred when _process_single_byt5_prompt attempted to call the previously non-existent get_byt5_text_tokens() method.
  • Improved Robustness: Fixed an issue preventing the LightX2VPipeline t2v demo from running successfully with prompts containing quotation marks, ensuring proper text processing.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request successfully addresses the AttributeError by adding the missing get_byt5_text_tokens() static method to the ByT5TextEncoder class. The new method correctly handles the tokenization of text prompts for the ByT5 model, ensuring that prompts containing quotation marks are processed without error. The implementation is clear and directly resolves the reported issue.

@helloyongyang helloyongyang commented Dec 4, 2025

Hello, thank you for your submission. However, you forgot to add "self" in the get_byt5_text_tokens function; you can refer to the commit for the fix. By the way, I noticed from your profile that you are looking for internship opportunities. Your PR is very well organized and looks great. Feel free to share your contact details (email), and we can discuss internship opportunities. @e1ijah1
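
For readers following along, the signature issue being pointed out would look roughly like this (a hypothetical reconstruction, not the actual commit):

# As submitted: defined without `self`. The call site
# self.get_byt5_text_tokens(tokenizer, max_length, text) implicitly
# passes the instance as the first positional argument, so Python
# sees four arguments for a three-parameter function and raises a
# TypeError.
def get_byt5_text_tokens(tokenizer, max_length, text):
    ...

# Fixed: an instance method whose parameters line up with the call site.
def get_byt5_text_tokens(self, tokenizer, max_length, text):
    ...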
