Typhoon ASR Real-Time is a next-generation, open-source Automatic Speech Recognition (ASR) model built for real-world streaming applications in the Thai language. It delivers fast and accurate transcriptions while running efficiently on standard CPUs, enabling anyone to host their own ASR service without expensive hardware or sending sensitive data to third-party clouds.
This repository provides a simple command-line script to demonstrate the performance and features of the Typhoon ASR Real-Time model.
See the blog post for more details: https://opentyphoon.ai/blog/th/typhoon-asr-realtime-release
For a hands-on demonstration without any local setup, you can run this project directly in Google Colab. The notebook provides a complete environment to transcribe audio files and experiment with the model.
- Simple Command-Line Interface: Transcribe Thai audio files directly from your terminal.
- Multiple Audio Formats: Supports a wide range of audio inputs, including `.wav`, `.mp3`, `.m4a`, `.flac`, and more.
- Estimated Timestamps: Generate word-level timestamps for your transcriptions.
- Hardware Flexible: Run inference on either a CPU or a CUDA-enabled GPU.
- Streaming Architecture: Based on a state-of-the-art FastConformer model designed for low-latency, real-time applications.
- Language: Thai
 
- Linux / macOS (Windows is not officially supported at the moment)
- Python 3.10
 
- Clone the repository:

  ```bash
  git clone git@github.com:scb-10x/typhoon-asr.git
  cd typhoon-asr
  ```

- Install the required dependencies:

  ```bash
  pip install -r requirements.txt
  ```
Alternatively, install and use the packaged version:
```bash
# Install the package
pip install typhoon-asr

# Command-line usage
typhoon-asr path/to/your_audio.wav
typhoon-asr path/to/your_audio.wav --with-timestamps --device cuda
```
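If you plan to pass `--device cuda`, it is worth confirming first that PyTorch can actually see a GPU. This quick check uses only the `torch` dependency and is not part of the project's own scripts:

```python
# Sanity check before using --device cuda:
# confirm PyTorch detects a usable CUDA GPU.
import torch

print(torch.cuda.is_available())  # True if inference can run on a GPU
print(torch.cuda.device_count())  # number of visible CUDA devices
```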
Python API usage:

```python
from typhoon_asr import transcribe

result = transcribe("path/to/your_audio.wav")
print(result['text'])

# With timestamps
result = transcribe("path/to/your_audio.wav", with_timestamps=True)
```
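The structure of the timestamped result beyond `result['text']` is not documented above, so treat the following as a sketch: it batch-transcribes a folder of WAV files and assumes word entries live under a hypothetical `timestamps` key (inspect the returned dict for the real field name):

```python
from pathlib import Path

from typhoon_asr import transcribe

# Transcribe every .wav file in a folder and print the text plus
# (assumed) word-level timestamp entries.
for audio_path in sorted(Path("audio").glob("*.wav")):
    result = transcribe(str(audio_path), with_timestamps=True)
    print(f"{audio_path.name}: {result['text']}")

    # 'timestamps' is an assumed key; check result.keys() if it differs.
    for entry in result.get("timestamps", []):
        print("  ", entry)
```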
Use the `typhoon_asr_inference.py` script to transcribe an audio file. The script automatically handles audio resampling and processing.

Basic Transcription (CPU):

```bash
python typhoon_asr_inference.py path/to/your_audio.m4a
```

Transcription with Estimated Timestamps:

```bash
python typhoon_asr_inference.py path/to/your_audio.wav --with-timestamps
```

Transcription on a GPU:

```bash
python typhoon_asr_inference.py path/to/your_audio.mp3 --device cuda
```

Arguments:

- `input_file`: (Required) The path to the input audio file.
- `--with-timestamps`: (Optional) Generate and display estimated word timestamps.
- `--device`: (Optional) The device to run inference on. Choices: `auto`, `cpu`, `cuda`. Defaults to `auto`.
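As the sample output below shows, the script resamples input audio to 16 kHz before inference. The preprocessing is roughly equivalent to this sketch built on the `librosa` and `soundfile` dependencies (an illustration of the technique, not the script's actual code):

```python
import librosa
import soundfile as sf

# Decode any supported format (.wav, .mp3, .m4a, .flac, ...),
# downmix to mono, and resample to the 16 kHz rate the model expects.
audio, sr = librosa.load("path/to/your_audio.m4a", sr=16000, mono=True)

# Write the processed audio to a temporary WAV file for inference.
sf.write("processed_audio.wav", audio, sr)
```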
Example output:

```
$ python typhoon_asr_inference.py audio/sample_th.wav --with-timestamps
🌪️ Typhoon ASR Real-Time Inference
==================================================
🎵 Processing audio: sample_th.wav
   Original: 48000 Hz, 4.5s
   Resampled: 48000 Hz → 16000 Hz
✅ Processed: processed_sample_th.wav
🌪️ Loading Typhoon ASR Real-Time model...
   Device: CPU
🕐 Running transcription with timestamp estimation...
==================================================
📝 TRANSCRIPTION RESULTS
==================================================
Mode: with timestamps
File: sample_th.wav
Duration: 4.5s
Processing: 1.32s
RTF: 0.293x 🚀 (Real-time capable!)
Transcription:
'ทดสอบการแปลงเสียงเป็นข้อความภาษาไทยแบบเรียลไทม์'
🕐 Word Timestamps (estimated):
---------------------------------------------
 1. [  0.00s -   0.56s] ทดสอบการแปลงเสียงเป็นข้อความภาษาไทยแบบเรียลไทม์
🧹 Cleaned up temporary file: processed_sample_th.wav
✅ Processing complete!
```
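The RTF (real-time factor) reported above is processing time divided by audio duration: 1.32 s / 4.5 s ≈ 0.293. Values below 1.0 mean the model transcribes faster than the audio plays back, which is what makes real-time streaming applications feasible.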
- NVIDIA NeMo Toolkit (`nemo_toolkit[asr]`)
- PyTorch (`torch`)
- Librosa (`librosa`)
- SoundFile (`soundfile`)
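To confirm these installed correctly, a quick import check can help (module paths assumed from each package's standard layout):

```python
# Verify the core dependencies are importable after installation.
import librosa
import soundfile
import torch
import nemo.collections.asr as nemo_asr  # NeMo's ASR collection

print("All dependencies imported successfully.")
```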
This project is licensed under the Apache 2.0 License. See individual datasets and checkpoints for their respective licenses.