SuperWhisper-like voice dictation for Linux with waveform UI

knowall-ai/turbo-whisper

Turbo Whisper

Turbo Whisper is a free, open source voice dictation and transcription app for Linux, macOS, and Windows. A SuperWhisper alternative with a beautiful GUI for real-time speech to text (STT). Supports 99 languages via OpenAI Whisper. Perfect for accessibility, RSI, and hands-free typing.

Voice dictation | Speech to text (STT) | Voice typing | Transcription | Open source | Multilingual | Hands-free

Features

  • Global hotkey (Ctrl+Shift+Space) to start/stop recording from anywhere
  • Waveform visualization - see your audio levels in real time with an animated orb
  • OpenAI API compatible - works with OpenAI Whisper API or self-hosted faster-whisper-server
  • Multilingual - supports 99 languages via Whisper
  • Auto-type - transcribed text is typed directly into the focused window
  • Clipboard support - text is also copied to clipboard
  • System tray - runs quietly in the background with autostart support
  • Cross-platform - Linux, macOS, and Windows support
  • Accessibility - great for RSI, carpal tunnel, or anyone preferring hands-free input

Perfect for AI CLI Tools

Turbo Whisper works well for voice input with terminal-based AI tools: press the hotkey, speak your prompt, and the transcription is typed directly into your terminal.

Claude Code Integration (Experimental)

Note: This feature is experimental and has limitations. See issue #23 for planned improvements.

When dictating into Claude Code, you may want the transcribed text held back until Claude finishes responding. Turbo Whisper has built-in support for this.

How it works:

  1. Turbo Whisper runs an HTTP server on localhost:7878
  2. After transcription, it waits up to 2 seconds for a "ready" signal
  3. When Claude Code sends the signal, the text is typed
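
The handshake above can be sketched in a few lines of Python. This is a minimal illustration and not Turbo Whisper's actual implementation: the class name `ReadySignalServer`, its method names, and the 204 response code are assumptions. Only the `/ready` path, port 7878, and the 2-second timeout come from the description above.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class ReadySignalServer:
    """Hypothetical helper: listens for POST /ready and lets a caller wait for it."""

    def __init__(self, port: int = 7878) -> None:
        self._ready = threading.Event()
        ready = self._ready

        class Handler(BaseHTTPRequestHandler):
            def do_POST(self) -> None:
                if self.path == "/ready":
                    ready.set()  # record that the signal arrived
                    self.send_response(204)
                else:
                    self.send_response(404)
                self.end_headers()

            def log_message(self, *args) -> None:
                pass  # keep the console quiet

        # Bind to loopback only; port 0 asks the OS for any free port
        self._httpd = HTTPServer(("127.0.0.1", port), Handler)
        self.port = self._httpd.server_address[1]
        threading.Thread(target=self._httpd.serve_forever, daemon=True).start()

    def wait_until_ready(self, timeout: float = 2.0) -> bool:
        """Block until /ready has been received or the timeout expires.

        A signal that arrived before this call also counts (an assumption
        of this sketch); the signal is consumed either way.
        """
        got_signal = self._ready.wait(timeout)
        self._ready.clear()
        return got_signal

    def close(self) -> None:
        self._httpd.shutdown()
        self._httpd.server_close()
```

A caller would type the text when `wait_until_ready()` returns `True` and fall back to another action when it returns `False`.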

Setup:

  1. Enable in your config (~/.config/turbo-whisper/config.json):
{
  "claude_integration": true,
  "claude_integration_port": 7878
}
  2. Create a Claude Code hook at ~/.claude/hooks/post-response.sh:
#!/bin/bash
# Signal Turbo Whisper that Claude is ready for input
curl -s -X POST http://localhost:7878/ready > /dev/null 2>&1
  3. Make it executable:
chmod +x ~/.claude/hooks/post-response.sh
  4. Configure Claude Code to run the hook (in ~/.claude/settings.json):
{
  "hooks": {
    "postResponse": ["~/.claude/hooks/post-response.sh"]
  }
}
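
To check the integration by hand, the same ready signal can be sent from Python instead of curl. This is a small sketch that assumes the default port; `send_ready_signal` is a hypothetical helper name and not part of Turbo Whisper.

```python
import urllib.error
import urllib.request


def send_ready_signal(port: int = 7878, timeout: float = 1.0) -> bool:
    """POST to /ready on localhost; return True if the server answered with 2xx."""
    request = urllib.request.Request(
        f"http://localhost:{port}/ready", data=b"", method="POST"
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Nothing is listening (app not running or integration disabled),
        # or the server answered with an error status
        return False
```

A `False` result usually means Turbo Whisper is not running or `claude_integration` is disabled.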

Without the hook: If Claude integration is enabled but no ready signal is received within 2 seconds, the text is copied to clipboard only (not typed). You'll see "Copied (Claude busy)" in the tray notification.

To disable: Set "claude_integration": false in your config for immediate typing without waiting.

Installation

Ubuntu/Debian (PPA) - Recommended

sudo add-apt-repository ppa:bengweeks/turbo-whisper
sudo apt update
sudo apt install turbo-whisper

Arch Linux (AUR) - Recommended

# Using yay
yay -S turbo-whisper

# Using paru
paru -S turbo-whisper

From Source

Ubuntu/Debian

# Install system dependencies
sudo apt install python3-pyaudio portaudio19-dev xdotool xclip

# Clone and install
git clone https://github.com/knowall-ai/turbo-whisper.git
cd turbo-whisper
python3 -m venv .venv
source .venv/bin/activate
pip install -e .

Fedora

sudo dnf install python3-pyaudio portaudio-devel xdotool xclip
git clone https://github.com/knowall-ai/turbo-whisper.git
cd turbo-whisper
python3 -m venv .venv
source .venv/bin/activate
pip install -e .

Arch Linux (manual)

sudo pacman -S python-pyaudio portaudio xdotool xclip
git clone https://github.com/knowall-ai/turbo-whisper.git
cd turbo-whisper
python3 -m venv .venv
source .venv/bin/activate
pip install -e .

macOS

# Install Homebrew dependencies
brew install portaudio

# Clone and install
git clone https://github.com/knowall-ai/turbo-whisper.git
cd turbo-whisper
python3 -m venv .venv
source .venv/bin/activate
pip install -e .

Windows

# Clone the repository
git clone https://github.com/knowall-ai/turbo-whisper.git
cd turbo-whisper

# Create virtual environment
python -m venv .venv
.venv\Scripts\activate

# Install dependencies
pip install -e .
pip install pyperclip  # Required for Windows clipboard/typing

Configuration

Create ~/.config/turbo-whisper/config.json (Linux/macOS) or %APPDATA%\turbo-whisper\config.json (Windows):

{
  "api_url": "https://api.openai.com/v1/audio/transcriptions",
  "api_key": "sk-your-api-key",
  "hotkey": ["ctrl", "shift", "space"],
  "language": "en",
  "auto_paste": true,
  "copy_to_clipboard": true,
  "typing_delay_ms": 5,
  "waveform_color": "#00ff88",
  "background_color": "#1a1a2e"
}
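
As a rough sketch of how a file like this can be located and read, the snippet below merges the user's JSON over a set of defaults. The defaults simply mirror the example above, and the function names `config_path` and `load_config` are hypothetical; the application's real loader may behave differently.

```python
import json
import os
import sys
from pathlib import Path
from typing import Optional

# Defaults mirror the example config; the real application's defaults may differ
DEFAULTS = {
    "api_url": "https://api.openai.com/v1/audio/transcriptions",
    "api_key": "",
    "hotkey": ["ctrl", "shift", "space"],
    "language": "en",
    "auto_paste": True,
    "copy_to_clipboard": True,
    "typing_delay_ms": 5,
    "waveform_color": "#00ff88",
    "background_color": "#1a1a2e",
}


def config_path() -> Path:
    """Return the config location for the current platform."""
    if sys.platform == "win32":
        base = Path(os.environ.get("APPDATA", Path.home() / "AppData" / "Roaming"))
        return base / "turbo-whisper" / "config.json"
    return Path.home() / ".config" / "turbo-whisper" / "config.json"


def load_config(path: Optional[Path] = None) -> dict:
    """Merge the user's JSON file over the defaults; a missing file yields the defaults."""
    path = path or config_path()
    config = dict(DEFAULTS)
    if path.exists():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    return config
```

With this approach a config file only needs to list the keys that differ from the defaults.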

API Endpoints

OpenAI API:

{
  "api_url": "https://api.openai.com/v1/audio/transcriptions",
  "api_key": "sk-your-api-key"
}

Self-hosted faster-whisper-server:

{
  "api_url": "http://your-server:8000/v1/audio/transcriptions",
  "api_key": ""
}
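
Both endpoints accept the same kind of request: a multipart form upload with the audio file and a model name. The sketch below builds such a request with the standard library only. The function name and the `model` parameter are assumptions (the example config has no model key); `whisper-1` is the OpenAI model name, and a self-hosted server would expect one of its own model IDs instead.

```python
import urllib.request
import uuid


def build_transcription_request(
    api_url: str,
    api_key: str,
    wav_bytes: bytes,
    model: str = "whisper-1",
    language: str = "en",
) -> urllib.request.Request:
    """Build a multipart/form-data POST for an OpenAI-compatible transcription endpoint."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain text form fields
    for name, value in (("model", model), ("language", language)):
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    # The audio file itself
    parts.append(
        (
            f"--{boundary}\r\n"
            'Content-Disposition: form-data; name="file"; filename="audio.wav"\r\n'
            "Content-Type: audio/wav\r\n\r\n"
        ).encode()
        + wav_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())

    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    if api_key:  # the self-hosted example uses an empty key, so no header is sent
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        api_url, data=b"".join(parts), headers=headers, method="POST"
    )
```

Passing the result to `urllib.request.urlopen` sends it. OpenAI's endpoint answers with JSON containing a `text` field by default; a self-hosted server is expected to do the same if it is API compatible.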

Usage

# Activate virtual environment
source .venv/bin/activate  # Linux/macOS
# or
.venv\Scripts\activate     # Windows

# Start the application
turbo-whisper

  1. Press Ctrl+Shift+Space to start recording
  2. Speak your text
  3. Press Ctrl+Shift+Space again to stop and transcribe
  4. Text is automatically typed into the focused window (wherever your cursor is)

Keyboard Shortcuts

Shortcut            Action
Ctrl+Shift+Space    Start/stop recording (configurable)
Esc                 Cancel recording (when window is focused)
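
The usage steps and shortcuts above amount to a small state machine. The sketch below is only an illustration of that flow: the class and state names are hypothetical, and ignoring hotkey presses while a transcription is in progress is an assumption.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    RECORDING = "recording"
    TRANSCRIBING = "transcribing"


class DictationSession:
    """Tracks where a dictation is in the record/transcribe cycle."""

    def __init__(self) -> None:
        self.state = State.IDLE

    def on_hotkey(self) -> State:
        """Hotkey: start recording when idle, stop and transcribe when recording."""
        if self.state is State.IDLE:
            self.state = State.RECORDING
        elif self.state is State.RECORDING:
            self.state = State.TRANSCRIBING
        # Presses during transcription are ignored (an assumption of this sketch)
        return self.state

    def on_escape(self) -> State:
        """Esc: discard the current recording without transcribing it."""
        if self.state is State.RECORDING:
            self.state = State.IDLE
        return self.state

    def on_transcription_done(self) -> State:
        """Called once the text has been typed or copied."""
        if self.state is State.TRANSCRIBING:
            self.state = State.IDLE
        return self.state
```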

Custom Hotkey

Edit your config to change the hotkey:

{
  "hotkey": ["ctrl", "alt", "w"]
}

Available modifiers: ctrl, shift, alt, super
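
A hotkey list like the one above can be checked before use. The following sketch splits it into modifiers and a trigger key. `parse_hotkey` is a hypothetical helper, and the rules it enforces (exactly one non-modifier key, at least one modifier) are assumptions rather than documented behavior.

```python
MODIFIERS = {"ctrl", "shift", "alt", "super"}


def parse_hotkey(keys: list[str]) -> tuple[frozenset[str], str]:
    """Split a hotkey list into (modifiers, trigger key) and validate its shape."""
    normalized = [key.strip().lower() for key in keys]
    modifiers = frozenset(key for key in normalized if key in MODIFIERS)
    triggers = [key for key in normalized if key not in MODIFIERS]
    if len(triggers) != 1:
        raise ValueError(f"expected exactly one non-modifier key, got {triggers}")
    if not modifiers:
        raise ValueError("a global hotkey needs at least one modifier")
    return modifiers, triggers[0]
```

For example, `parse_hotkey(["ctrl", "alt", "w"])` yields the modifier set `{"ctrl", "alt"}` and the trigger key `"w"`.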

Autostart on Login

To start Turbo Whisper automatically when you log in:

Linux (all distros):

# Create autostart directory if it doesn't exist
mkdir -p ~/.config/autostart

# Copy the desktop file (if installed via AUR/PPA)
cp /usr/share/applications/turbo-whisper.desktop ~/.config/autostart/

# Or create manually
cat > ~/.config/autostart/turbo-whisper.desktop << 'EOF'
[Desktop Entry]
Name=Turbo Whisper
Exec=turbo-whisper
Type=Application
X-GNOME-Autostart-enabled=true
EOF

macOS:

  • Open System Preferences → Users & Groups → Login Items (on macOS 13 and later: System Settings → General → Login Items)
  • Click + and add Turbo Whisper

Windows:

  • Press Win+R, type shell:startup, press Enter
  • Create a shortcut to turbo-whisper in that folder

Self-Hosting Whisper

You can run your own Whisper server for faster, private, and cost-free transcription using faster-whisper-server.

Hardware Requirements

Model      VRAM (GPU)   RAM (CPU)   Speed       Accuracy
tiny       ~1 GB        ~2 GB       Fastest     Basic
base       ~1 GB        ~2 GB       Very fast   Good
small      ~2 GB        ~4 GB       Fast        Better
medium     ~5 GB        ~8 GB       Moderate    Great
large-v3   ~10 GB       ~16 GB      Slower      Best

Recommendations:

  • GPU with 6+ GB VRAM: Use large-v3 for best accuracy
  • GPU with 4 GB VRAM: Use small or medium
  • CPU only: Use tiny or base (expect slower transcription)
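
If you want to script the choice, the recommendations above can be written as a small lookup. This sketch encodes the three bullets as written. Both function names are hypothetical, and treating GPUs with less than 4 GB like the CPU-only tier is an assumption, since the list does not cover that case.

```python
from typing import Optional


def recommend_models(vram_gb: Optional[float]) -> list[str]:
    """Map available GPU memory to the model tiers recommended above.

    Pass None for a CPU-only machine.
    """
    if vram_gb is not None and vram_gb >= 6:
        return ["large-v3"]
    if vram_gb is not None and vram_gb >= 4:
        return ["small", "medium"]
    # CPU only. GPUs under 4 GB are not covered by the list; treating them
    # like the CPU tier is an assumption of this sketch.
    return ["tiny", "base"]


def docker_model_id(name: str) -> str:
    """Build the Systran model ID used with the WHISPER__MODEL variable."""
    return f"Systran/faster-whisper-{name}"
```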

Quick Start with Docker

# With NVIDIA GPU (recommended)
docker run --gpus=all -p 8000:8000 \
  -e WHISPER__MODEL=Systran/faster-whisper-large-v3 \
  fedirz/faster-whisper-server:latest-cuda

# With smaller model (less VRAM)
docker run --gpus=all -p 8000:8000 \
  -e WHISPER__MODEL=Systran/faster-whisper-small \
  fedirz/faster-whisper-server:latest-cuda

# CPU only (slower, no GPU required)
docker run -p 8000:8000 \
  -e WHISPER__MODEL=Systran/faster-whisper-base \
  fedirz/faster-whisper-server:latest-cpu

Available Models

Models are downloaded automatically on first use:

Model ID                          Size
Systran/faster-whisper-tiny       ~75 MB
Systran/faster-whisper-base       ~150 MB
Systran/faster-whisper-small      ~500 MB
Systran/faster-whisper-medium     ~1.5 GB
Systran/faster-whisper-large-v3   ~3 GB

Persistent Model Cache

To avoid re-downloading models on container restart:

docker run --gpus=all -p 8000:8000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  -e WHISPER__MODEL=Systran/faster-whisper-large-v3 \
  fedirz/faster-whisper-server:latest-cuda

Configure Turbo Whisper

Update your config to use the self-hosted server:

{
  "api_url": "http://localhost:8000/v1/audio/transcriptions",
  "api_key": ""
}

Verify Server is Running

curl http://localhost:8000/health
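
The same check can be done from Python, for example in a startup script that waits for the container. This is a minimal sketch that assumes the `/health` endpoint shown above answers with HTTP 200 when the server is up; `server_is_healthy` is a hypothetical helper name.

```python
import urllib.error
import urllib.request


def server_is_healthy(base_url: str = "http://localhost:8000", timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200."""
    url = f"{base_url.rstrip('/')}/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status
        return False
```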

Documentation

For detailed documentation, see the docs/ directory.

Troubleshooting

Linux: Hotkey conflicts

If Ctrl+Shift+Space conflicts with another application, edit the config:

{
  "hotkey": ["ctrl", "alt", "w"]
}

Windows: PyAudio installation fails

Install the pre-built wheel:

pip install pipwin
pipwin install pyaudio

macOS: Accessibility permissions

Grant accessibility permissions to your terminal app in System Preferences → Security & Privacy → Privacy → Accessibility (on macOS 13 and later: System Settings → Privacy & Security → Accessibility).

For more troubleshooting tips, see docs/TROUBLESHOOTING.adoc.

License

MIT License - see LICENSE for details.

Keywords

Voice dictation Linux, speech to text, STT, voice typing, transcription, transcribe audio, OpenAI Whisper GUI, dictation software, speech recognition, voice input, hands-free typing, accessibility, SuperWhisper alternative, faster-whisper, voice to text CLI, terminal dictation, free open source, multilingual, 99 languages, RSI, carpal tunnel, real-time transcription, local whisper, offline speech recognition, nerd-dictation alternative, voice coding, voice input terminal, how to dictate on Linux, best voice dictation Linux, Ubuntu voice typing, Arch Linux dictation.

Credits

Inspired by SuperWhisper for macOS.
