Add typing to speech_to_text_finetune.py #15326

Merged
nithinraok merged 5 commits into NVIDIA-NeMo:main from Blynt-AI:typing-fixes
Jan 27, 2026
Conversation

Garvys (Contributor) commented Jan 27, 2026

What does this PR do?

This PR adds typing to the functions used in speech_to_text_finetune.py in order to make the code easier to navigate with an IDE.
Collection: [ASR]

Changelog

  • Added typing to speech_to_text_finetune.py

GitHub Actions CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • Improvements

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs in various areas.

Additional Information

  • Related to # (issue)

github-actions bot added the ASR label Jan 27, 2026
Signed-off-by: Alexandre Caulier <alexandre.caulier.a@gmail.com>
Signed-off-by: Alexandre Caulier <alexandre.caulier.a@gmail.com>
Garvys marked this pull request as ready for review January 27, 2026 18:02
Garvys (Contributor, Author) commented Jan 27, 2026

Very quick PR if you have a bit of time @nithinraok
thanks a lot

Copilot AI (Contributor) left a comment

Pull request overview

This PR adds type annotations to helper functions in speech_to_text_finetune.py to improve IDE navigation and code maintainability. The changes follow the PR's stated goal of making the code easier to navigate with an IDE by adding explicit type hints.

Changes:

  • Added DictConfig import from omegaconf
  • Added type annotations to get_base_model, check_vocabulary, update_tokenizer, and setup_dataloaders functions in speech_to_text_finetune.py
  • Added partial type annotation to add_global_adapter_cfg function in train_asr_adapter.py

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File | Description
examples/asr/speech_to_text_finetune.py | Added type annotations (pl.Trainer, DictConfig, ASRModel) to four helper functions, with DictConfig import added to support the changes
examples/asr/asr_adapters/train_asr_adapter.py | Added partial type annotation (ASRModel) to the add_global_adapter_cfg function's model parameter


- def update_tokenizer(asr_model, tokenizer_dir, tokenizer_type):
+ def update_tokenizer(asr_model: ASRModel, tokenizer_dir, tokenizer_type) -> ASRModel:
Copilot AI Jan 27, 2026

The type annotations for the parameters tokenizer_dir and tokenizer_type are missing. Based on the change_vocabulary method signature in the ASRModel classes (e.g., rnnt_bpe_models.py lines 340-344), these should be typed as:

  • tokenizer_dir: Union[str, DictConfig] (can be a directory path string or a DictConfig for 'agg' tokenizer type)
  • tokenizer_type: str

This would also require adding Union to the imports from the typing module.
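
As a rough illustration of the suggestion above, a fully annotated update_tokenizer might look like the sketch below. It is not the code from the PR: the body is reduced to a single change_vocabulary call, the keyword names new_tokenizer_dir and new_tokenizer_type are assumed, and the TYPE_CHECKING guard with quoted annotations is a choice made here so the snippet runs without NeMo or OmegaConf installed (the actual script may simply import both at module level).

```python
from typing import TYPE_CHECKING, Union

if TYPE_CHECKING:
    # Seen only by type checkers and IDEs; never imported at runtime.
    from nemo.collections.asr.models import ASRModel
    from omegaconf import DictConfig


def update_tokenizer(
    asr_model: "ASRModel",
    tokenizer_dir: "Union[str, DictConfig]",
    tokenizer_type: str,
) -> "ASRModel":
    """Sketch only: swap the model's tokenizer and hand the model back."""
    # Assumption: change_vocabulary accepts these keyword names.
    asr_model.change_vocabulary(
        new_tokenizer_dir=tokenizer_dir,
        new_tokenizer_type=tokenizer_type,
    )
    return asr_model
```

With annotations like these, an IDE can resolve the return value to ASRModel and flag a call that passes, say, an integer as tokenizer_type.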


- def add_global_adapter_cfg(model, global_adapter_cfg):
+ def add_global_adapter_cfg(model: ASRModel, global_adapter_cfg):
Copilot AI Jan 27, 2026

The type annotation for parameter global_adapter_cfg is missing. Based on the function body (lines 131-135), this parameter can be a dataclass, DictConfig, or dict. The appropriate type annotation would be Union[DictConfig, dict] since the function checks for these types explicitly. This would require ensuring Union is imported from the typing module.
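
A signature-only sketch of that suggestion follows. The function body is not shown in this conversation, so it is left as a stub rather than guessed at; the None return type is likewise an assumption, and the TYPE_CHECKING guard with quoted annotations is used here only so the snippet runs without NeMo or OmegaConf installed.

```python
from typing import TYPE_CHECKING, Union

if TYPE_CHECKING:
    # Seen only by type checkers and IDEs; never imported at runtime.
    from nemo.collections.asr.models import ASRModel
    from omegaconf import DictConfig


def add_global_adapter_cfg(
    model: "ASRModel",
    global_adapter_cfg: "Union[DictConfig, dict]",
) -> None:  # return type assumed, not confirmed by the PR
    """Signature sketch; the real body lives in train_asr_adapter.py."""
    # Illustrative stub: the actual logic is intentionally not reproduced.
    raise NotImplementedError("illustrative stub")
```

Note that the comment also mentions dataclass inputs, which Union[DictConfig, dict] does not cover; whether to widen the annotation is left to the authors.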

Signed-off-by: Alexandre Caulier <alexandre.caulier.a@gmail.com>
Signed-off-by: Alexandre Caulier <alexandre.caulier.a@gmail.com>
nithinraok (Member) left a comment

Thanks, LGTM!

nithinraok enabled auto-merge (squash) January 27, 2026 19:12
nithinraok merged commit 09616b9 into NVIDIA-NeMo:main Jan 27, 2026
54 checks passed
3 participants