Summary
The current Claude Code integration has several limitations that need to be addressed for a production-ready experience.
Current Limitations
1. No "busy" signal
- Only POST /ready exists to signal Claude is ready for input
- No POST /busy signal when Claude starts processing
- The ready signal expires after 30 seconds, which is imprecise
2. No message queueing
- Multiple dictations while Claude is busy are NOT accumulated
- Each transcription times out independently
- Only the last clipboard copy survives
3. No active window detection
- Integration doesn't check if user is actually in a Claude Code window
- Ready signal applies globally, not per-session
4. No multi-terminal/tab support
- Users may have multiple terminals (Konsole, Windows Terminal, Ghostty, etc.)
- Multiple tabs/splits within each terminal
- Different Claude Code sessions in different tabs
- No way to track which specific session dictation was intended for
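The queueing gap in item 2 could be closed along these lines. This is only a sketch: `DictationQueue` and its method names are hypothetical, and joining queued transcriptions with a single space is an assumption about the desired behaviour, not something the current integration does.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional


@dataclass
class DictationQueue:
    """Accumulates transcriptions while Claude is busy (hypothetical helper)."""

    busy: bool = False
    pending: Deque[str] = field(default_factory=deque)

    def mark_busy(self) -> None:
        """Claude started processing: hold any new dictations."""
        self.busy = True

    def submit(self, text: str) -> Optional[str]:
        """Return the text to type immediately, or None if it was queued."""
        if self.busy:
            self.pending.append(text)
            return None
        return text

    def mark_ready(self) -> Optional[str]:
        """Claude is ready again: release everything queued, in arrival order."""
        self.busy = False
        combined = " ".join(self.pending)  # assumption: join with one space
        self.pending.clear()
        return combined or None
```

With this shape, several dictations made while Claude is busy come out as one combined message on the next ready signal instead of each timing out on its own.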
Proposed Solution
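Phase 1: Basic improvements
- Add a POST /busy endpoint to mark Claude as processing
Phase 2: Window tracking
- Track the active window (xdotool on X11, platform-specific for Wayland)
Phase 3: Session-aware integration
- POST /ready includes a session ID parameter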
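As a rough illustration of how Phase 1 and Phase 3 might fit together, the sketch below keeps ready state per session and lets an explicit busy signal clear it at once, with the existing 30-second expiry kept only as a fallback. `ReadyTracker`, its method names, and the `"default"` session key are hypothetical. The clock is injectable so the expiry can be tested without sleeping.

```python
import time
from typing import Callable, Dict

READY_TTL_SECONDS = 30.0  # the expiry described under "Current Limitations"


class ReadyTracker:
    """Per-session ready/busy state (hypothetical sketch)."""

    def __init__(
        self,
        ttl: float = READY_TTL_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl
        self._clock = clock
        self._ready_since: Dict[str, float] = {}

    def on_ready(self, session_id: str = "default") -> None:
        """Handle POST /ready: remember when this session became ready."""
        self._ready_since[session_id] = self._clock()

    def on_busy(self, session_id: str = "default") -> None:
        """Handle POST /busy: the session stops being ready immediately."""
        self._ready_since.pop(session_id, None)

    def is_ready(self, session_id: str = "default") -> bool:
        """True if the session signalled ready and the fallback TTL has not passed."""
        since = self._ready_since.get(session_id)
        return since is not None and self._clock() - since < self._ttl
```

Callers that do not send a session ID fall back to the single `"default"` session, which matches today's global behaviour.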
Technical Considerations
Window identification
- X11: xdotool getactivewindow returns window ID
- Wayland: More complex - may need compositor-specific solutions (KDE, GNOME, wlroots)
- Windows: GetForegroundWindow() API
- macOS: Accessibility APIs
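A minimal sketch of the platform dispatch, covering only the X11 and Windows cases listed above. The function names are hypothetical, and Wayland and macOS deliberately return None here because they need compositor-specific or Accessibility-based code that is not sketched.

```python
import subprocess
import sys
from typing import Callable, Optional


def _xdotool_active_window() -> str:
    """Run `xdotool getactivewindow` and return its raw stdout."""
    result = subprocess.run(
        ["xdotool", "getactivewindow"],
        capture_output=True,
        text=True,
        check=True,
        timeout=2,
    )
    return result.stdout


def parse_window_id(raw: str) -> Optional[int]:
    """xdotool prints the window ID as a decimal number; anything else is rejected."""
    raw = raw.strip()
    return int(raw) if raw.isdigit() else None


def active_window_id(
    platform: str = sys.platform,
    x11_query: Callable[[], str] = _xdotool_active_window,
) -> Optional[int]:
    """Return the focused window's ID, or None if it cannot be determined."""
    if platform.startswith("linux"):
        try:
            return parse_window_id(x11_query())
        except (OSError, subprocess.SubprocessError):
            # xdotool missing, timed out, or failed (e.g. under pure Wayland)
            return None
    if platform == "win32":
        import ctypes  # imported lazily; windll exists only on Windows

        return ctypes.windll.user32.GetForegroundWindow() or None
    return None  # macOS and native Wayland are not covered by this sketch
```

The `x11_query` parameter exists only so the lookup can be replaced in tests or swapped for a compositor-specific command later.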
Terminal detection
- Detecting if active window is a terminal: check window class/name
- Detecting if Claude Code is running in that terminal: process tree inspection
- Multiple Claude instances: need unique session identifiers
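The two checks above could look roughly like this. The window-class names and the process name `claude` are assumptions that would need verifying against real terminals and installs. The process table is passed in as plain data, so how it is collected (for example from /proc or a process library) is left open.

```python
from typing import Dict, List, Set, Tuple

# Assumed window-class names; actual values should be checked per terminal.
KNOWN_TERMINAL_CLASSES = {"konsole", "ghostty", "windowsterminal"}

# pid -> (parent pid, process name)
ProcessTable = Dict[int, Tuple[int, str]]


def is_terminal_window(window_class: str) -> bool:
    """Case-insensitive match of a window class against known terminals."""
    return window_class.strip().lower() in KNOWN_TERMINAL_CLASSES


def has_descendant_named(table: ProcessTable, root_pid: int, name: str) -> bool:
    """Walk the process tree below root_pid looking for a process called `name`."""
    children: Dict[int, List[int]] = {}
    for pid, (ppid, _) in table.items():
        children.setdefault(ppid, []).append(pid)

    stack = list(children.get(root_pid, []))
    seen: Set[int] = set()
    while stack:
        pid = stack.pop()
        if pid in seen:
            continue  # guard against malformed tables with cycles
        seen.add(pid)
        if table[pid][1] == name:
            return True
        stack.extend(children.get(pid, []))
    return False
```

For example, `has_descendant_named(table, terminal_pid, "claude")` answers whether a given terminal process has a Claude Code process somewhere beneath it. One terminal process may host many tabs, so this alone does not identify which tab is focused.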
Hook communication
Current hook just sends:
curl -X POST http://localhost:7878/ready
Enhanced hook could send:
# Double quotes around the JSON body so the shell expands the variables
curl -X POST http://localhost:7878/ready \
  -H "Content-Type: application/json" \
  -d "{\"session_id\": \"$CLAUDE_SESSION_ID\", \"window_id\": \"$WINDOWID\"}"
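On the receiving side, the server would need to accept both the bare POST the current hook sends and the enhanced JSON body. A sketch of that parsing step follows. `ReadySignal` and `parse_ready_body` are hypothetical names, and treating empty or malformed bodies as "no session information" is an assumption made to stay backward compatible.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ReadySignal:
    """Fields carried by a /ready request; both are optional."""

    session_id: Optional[str] = None
    window_id: Optional[str] = None


def _clean(value: Any) -> Optional[str]:
    """Normalise a JSON value to a non-empty string, or None."""
    text = str(value).strip() if value is not None else ""
    return text or None


def parse_ready_body(body: bytes) -> ReadySignal:
    """Parse a /ready request body, tolerating the legacy empty POST."""
    if not body.strip():
        return ReadySignal()  # current hook: no body at all
    try:
        data = json.loads(body)
    except (json.JSONDecodeError, UnicodeDecodeError):
        return ReadySignal()  # assumption: ignore bad payloads rather than reject
    if not isinstance(data, dict):
        return ReadySignal()
    return ReadySignal(_clean(data.get("session_id")), _clean(data.get("window_id")))
```

An unset environment variable in the hook arrives as an empty string, which this sketch maps to None so it is not mistaken for a real session.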
Alternatives Considered
- Clipboard-only mode: Don't auto-type, just copy to clipboard. User manually pastes. Simpler but worse UX.
- Named pipe per session: Each Claude session creates a named pipe. Turbo Whisper writes to the pipe for the active window. More complex but robust.
- D-Bus integration: Use D-Bus for Linux desktop integration. Would work well with GNOME/KDE but not portable.
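To make the named-pipe alternative concrete, here is a POSIX-only sketch. The `claude-<session>.fifo` naming scheme and all function names are hypothetical. Non-blocking opens are used so the writer fails fast instead of hanging when no session is reading.

```python
import os


def pipe_path(base_dir: str, session_id: str) -> str:
    """Hypothetical naming scheme: one FIFO per Claude session."""
    return os.path.join(base_dir, f"claude-{session_id}.fifo")


def create_session_pipe(base_dir: str, session_id: str) -> str:
    """Create the FIFO for a session (POSIX only) and return its path."""
    path = pipe_path(base_dir, session_id)
    os.mkfifo(path, 0o600)  # readable and writable by the owning user only
    return path


def open_reader(path: str) -> int:
    """Session side: open the FIFO for reading without blocking on a writer."""
    return os.open(path, os.O_RDONLY | os.O_NONBLOCK)


def send_text(path: str, text: str) -> None:
    """Dictation side: write one line; raises OSError if nobody is reading."""
    fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
    try:
        os.write(fd, text.encode("utf-8") + b"\n")
    finally:
        os.close(fd)
```

The writer would pick the pipe by mapping the active window to a session ID, which is the part that still depends on the window-tracking work.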
Related
- Ghostty may freeze when receiving long text via evdev - investigate if this is a Ghostty bug or Turbo Whisper issue