Add speech-to-text support across providers that offer STT endpoints.
Phase 1 — URL-based STT (no file upload):
- AssemblyAI (polling pattern)
- Groq, Together, Fireworks, Mistral (sync/multipart)
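The polling pattern differs from the sync providers in that the first request only returns a job id, and the transcript has to be fetched in a loop. A minimal sketch of that loop follows, assuming the provider reports a `status` field that ends at `completed` or `error`. The `submit` and `fetch` callables, the field names, and the default interval and timeout are all hypothetical placeholders for illustration, not existing code in this repo.

```python
import time
from typing import Any, Callable, Dict


def poll_transcript(
    submit: Callable[[str], str],
    fetch: Callable[[str], Dict[str, Any]],
    audio_url: str,
    interval: float = 3.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Submit an audio URL, then poll until the job finishes.

    submit(audio_url) -> job id           (hypothetical: wraps the POST)
    fetch(job_id)     -> job status dict  (hypothetical: wraps the GET)
    """
    job_id = submit(audio_url)
    deadline = time.monotonic() + timeout
    while True:
        job = fetch(job_id)
        status = job.get("status")
        # Assumed terminal states: "completed" and "error".
        if status == "completed":
            return job.get("text", "")
        if status == "error":
            raise RuntimeError(job.get("error", "transcription failed"))
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {status!r} after {timeout}s")
        sleep(interval)
```

Injecting `submit`, `fetch`, and `sleep` keeps the loop testable without network access; the real HTTP calls would go through the proxy.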
Requires:
- `multipart/form-data` support in `proxy.py` (currently only handles JSON bodies)
- New `url` UI param type for audio URL input
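For the multipart requirement, the proxy would need to serialize plain text fields (including the audio URL) into a `multipart/form-data` body instead of JSON. The sketch below shows one way to do that with only the standard library. The helper name `encode_multipart` and the field names in the usage note are assumptions for illustration; they are not taken from `proxy.py` or from any provider's documentation.

```python
import uuid
from typing import Dict, Tuple


def encode_multipart(fields: Dict[str, str]) -> Tuple[bytes, str]:
    """Encode text-only form fields as multipart/form-data.

    Returns (body, content_type). Hypothetical helper: it does not
    escape quotes in field names and does not handle file parts,
    which Phase 2 would need.
    """
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")  # blank line separates part headers from content
        lines.append(value)
    lines.append(f"--{boundary}--")  # closing delimiter
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"
```

A caller might use it as `body, content_type = encode_multipart({"model": "some-model", "url": audio_url})` and send `content_type` as the `Content-Type` header, falling back to the existing JSON path for providers that do not need multipart.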
Phase 2 — File upload:
- OpenAI Whisper, DeepInfra, SambaNova
- Needs a file upload UI component and binary body handling
A design plan exists in docs/plans/2026-02-23-speech-to-text-plan.md.