GitHub - biodatlab/thonburian-tts: ThonburianTTS, a finetuned Thai TTS based on the E2-TTS and F5-TTS architectures, designed to improve pronunciation accuracy, alignment robustness, and zero-shot speaker adaptation for the Thai language

🔊 Model Checkpoints | 🤗 Gradio Demo | 📄 ThonburianTTS Paper | Colab Notebook | GitHub

Thonburian TTS

Thonburian TTS is a Thai Text-to-Speech (TTS) engine built on top of F5-TTS.
It generates natural, expressive Thai speech using flow-matching diffusion techniques and can mimic a reference voice from a short audio sample. The system supports:

Thai language generation (language="th")
Reference-based voice cloning using short audio clips
High-quality synthesis with controllable speed and silence trimming
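
To make the silence-trimming feature concrete, here is a minimal sketch of threshold-based trimming. It assumes the `silence_threshold=-45` value used in Quick Usage is a level in dB relative to full scale; the README does not state the unit. The helper names `dbfs` and `trim_silence` are hypothetical and not part of `flowtts`. Practical implementations usually measure short frames rather than single samples.

```python
import math


def dbfs(sample: float) -> float:
    """Level of one sample in dB relative to full scale (amplitude 1.0)."""
    magnitude = abs(sample)
    return -math.inf if magnitude == 0.0 else 20.0 * math.log10(magnitude)


def trim_silence(samples: list[float], threshold_db: float = -45.0) -> list[float]:
    """Drop leading and trailing samples quieter than threshold_db.

    Assumption: the threshold is in dBFS. Samples between the first and
    last loud sample are kept untouched, including quiet ones.
    """
    loud = [i for i, s in enumerate(samples) if dbfs(s) > threshold_db]
    if not loud:
        return []
    return samples[loud[0]: loud[-1] + 1]


signal = [0.0, 0.001, 0.5, -0.3, 0.002, 0.0]
print(trim_silence(signal))  # [0.5, -0.3]
```

A threshold of -45 dB corresponds to an amplitude of roughly 0.0056, so the 0.001 and 0.002 samples at the edges count as silence in this sketch.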

Pipeline Overview

This workflow enables:

High-quality Thai speech generation from text
Voice cloning with style and tone preservation
ASR-TTS integration for interactive voice applications

Quick Usage

Below is a minimal example for generating Thai speech with voice cloning using a reference sample.

from flowtts.inference import FlowTTSPipeline, ModelConfig, AudioConfig
import torch

# Configure F5-TTS model
model_config = ModelConfig(
    language="th",
    model_type="F5",
    checkpoint="hf://biodatlab/ThonburianTTS/megaF5/mega_f5_last.safetensors",
    vocab_file="hf://biodatlab/ThonburianTTS/megaF5/mega_vocab.txt",
    vocoder="vocos",
    device="cuda" if torch.cuda.is_available() else "cpu"
)

# Basic audio settings
audio_config = AudioConfig(
    silence_threshold=-45,
    cfg_strength=2.5,
    speed=1.0
)

pipeline = FlowTTSPipeline(model_config, audio_config)

# Input text and reference voice
text = "ยินดีที่ได้รู้จักคุณวันนี้อากาศดีมาก"
ref_voice = "ref_samples/ref_sample.wav"
ref_text = "ยินดีที่ได้รู้จัก"  # Manual transcript of the reference clip

# Generate speech
output_path = pipeline(
    text=text,
    ref_voice=ref_voice,
    ref_text=ref_text,
    output_file="f5_output.wav"
)
print(f"Generated F5 audio saved to: {output_path}")
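
To synthesize several sentences with the same reference voice, the call above can be wrapped in a small loop. The following is a sketch, not part of the repository: `synthesize_batch` is a hypothetical helper, and it assumes only what the example shows, namely that the pipeline object is callable with `text`, `ref_voice`, `ref_text`, and `output_file`, and returns the output path.

```python
from pathlib import Path
from typing import Callable, Sequence


def synthesize_batch(
    pipeline: Callable[..., str],
    sentences: Sequence[str],
    ref_voice: str,
    ref_text: str,
    out_dir: str = "outputs",
) -> list[str]:
    """Generate one WAV file per sentence, reusing one reference voice.

    `pipeline` is any callable with the keyword arguments used in the
    Quick Usage example (assumed interface).
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    paths = []
    for index, sentence in enumerate(sentences, start=1):
        target = out / f"utt_{index:03d}.wav"
        result = pipeline(
            text=sentence,
            ref_voice=ref_voice,
            ref_text=ref_text,
            output_file=str(target),
        )
        paths.append(str(result))
    return paths
```

With the objects from the example above, a call would look like `synthesize_batch(pipeline, [text], ref_voice, ref_text)`. Passing the pipeline as an argument keeps the helper independent of how the model is configured.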

Installation

Install dependencies:

pip install torch cached-path librosa transformers f5-tts
sudo apt install ffmpeg
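
After installing, a quick check can confirm that the packages are importable and that `ffmpeg` is on the `PATH`. This is a sketch using only the standard library; `check_environment` is a hypothetical helper, and the import names `cached_path` and `f5_tts` are assumed to be the module names installed by the `cached-path` and `f5-tts` packages.

```python
import importlib.util
import shutil

# Import names assumed to correspond to the pip packages listed above.
REQUIRED_MODULES = ["torch", "cached_path", "librosa", "transformers", "f5_tts"]


def check_environment(modules=REQUIRED_MODULES, binaries=("ffmpeg",)) -> dict[str, bool]:
    """Return a mapping of dependency name to whether it was found."""
    # find_spec locates a top-level module without importing it.
    status = {name: importlib.util.find_spec(name) is not None for name in modules}
    # shutil.which returns None when the executable is not on PATH.
    status.update({binary: shutil.which(binary) is not None for binary in binaries})
    return status


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name:<14}{'ok' if found else 'MISSING'}")
```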

Model Checkpoints

Model Component	Description	URL
F5-TTS Thai	Flow Matching-based Thai TTS models	Link
F5-TTS IPA	Flow Matching-based Thai-IPA TTS models	Link
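
The `checkpoint` and `vocab_file` values in Quick Usage point at these checkpoints with `hf://` URIs. The sketch below shows how such a URI breaks down, assuming the layout `hf://<organization>/<repository>/<path inside the repository>`. `parse_hf_uri` and `HFLocation` are hypothetical names for illustration; the pipeline resolves these URIs itself, so this is only needed when fetching a file by other means.

```python
from typing import NamedTuple


class HFLocation(NamedTuple):
    repo_id: str   # "<organization>/<repository>"
    filename: str  # path of the file inside the repository


def parse_hf_uri(uri: str) -> HFLocation:
    """Split an hf:// URI into repository id and file path (assumed layout)."""
    prefix = "hf://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an hf:// URI: {uri!r}")
    # Split into organization, repository, and the remaining file path.
    parts = uri[len(prefix):].split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected hf://<org>/<repo>/<path>, got {uri!r}")
    organization, repository, filename = parts
    return HFLocation(f"{organization}/{repository}", filename)


location = parse_hf_uri("hf://biodatlab/ThonburianTTS/megaF5/mega_f5_last.safetensors")
print(location.repo_id)   # biodatlab/ThonburianTTS
print(location.filename)  # megaF5/mega_f5_last.safetensors
```

The two fields match the `repo_id` and `filename` arguments of `huggingface_hub.hf_hub_download`, which is one way to download a checkpoint manually.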

Example Outputs

🎵 Sample 1 – Single-speaker Thai Normal Text

🎵 Sample 2 – Single-Speaker Thai Code-mixed Text

🎵 Sample 3 – Multi-Speaker Conversational Speech

Developers

Citation

If you use ThonburianTTS in your research, please cite:

@INPROCEEDINGS{11320472,
  author={Aung, Thura and Sriwirote, Panyut and Thavornmongkol, Thanachot and Pipatsrisawat, Knot and Achakulvisut, Titipat and Aung, Zaw Htet},
  booktitle={2025 20th International Joint Symposium on Artificial Intelligence and Natural Language Processing (iSAI-NLP)}, 
  title={ThonburianTTS: Enhancing Neural Flow Matching Models for Authentic Thai Text-to-Speech}, 
  year={2025},
  volume={},
  number={},
  pages={1-6},
  keywords={Adaptation models;Codes;Accuracy;Error analysis;Phonetics;Robustness;Natural language processing;Text to speech;Noise measurement;Research and development;Thai text-to-speech;Flow matching;F5-TTS},
  doi={10.1109/iSAI-NLP66160.2025.11320472}}

Thura Aung, Panyut Sriwirote, Thanachot Thavornmongkol, Knot Pipatsrisawat, Titipat Achakulvisut, Zaw Htet Aung, "ThonburianTTS: Enhancing Neural Flow Matching Models for Authentic Thai Text-to-Speech", 2025 20th International Joint Symposium on Artificial Intelligence and Natural Language Processing (iSAI-NLP), Phuket, Thailand, 2025, pp. 1-6, doi: 10.1109/iSAI-NLP66160.2025.11320472.

License

Our code is released under the MIT License. The models are released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License (CC BY-NC-SA 4.0).

