Port of OpenAI's Whisper model in C/C++
Updated Oct 28, 2025 - C++
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
Faster Whisper transcription with CTranslate2
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
🧠 Leon is your open-source personal assistant.
kaldi-asr/kaldi is the official location of the Kaldi project.
Translate a video from one language to another and add dubbing. A tool for video translation, speech transcription, and subtitle dubbing.
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
A PyTorch-based Speech Toolkit
Speech recognition module for Python, supporting several engines and APIs, online and offline.
A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.
Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Ascend NPU, x86_64 servers, a websocket server/client, and 12 programming languages.
A Deep-Learning-Based Chinese Speech Recognition System
Multilingual Voice Understanding Model
💬 Speech recognition for your site
Silero Models: pre-trained text-to-speech models made embarrassingly simple
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
Open-source, accurate, and easy-to-use video speech recognition and clipping tool with integrated LLM-based AI clipping.
A free, open source, and extensible speech-to-text application that works completely offline.
Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.