Add speech-to-text definitions for 8 providers #2

@ajot

Add speech-to-text support across providers that offer STT endpoints.

Phase 1 — URL-based STT (no file upload):

  • AssemblyAI (polling pattern)
  • Groq, Together, Fireworks, Mistral (sync/multipart)
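The polling pattern noted for AssemblyAI differs from the synchronous providers in that the transcript is fetched repeatedly until the job finishes. A minimal sketch of such a loop follows, assuming the status endpoint returns a JSON object with a `status` field whose terminal values are `"completed"` and `"error"`. The function name `poll_until_done`, the `fetch_status` callable, and the default interval and timeout are all hypothetical and not taken from the design plan.

```python
import time
from typing import Any, Callable, Dict


def poll_until_done(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_s: float = 1.0,
    timeout_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status() until the job reaches a terminal state.

    Assumption: the provider reports progress via a "status" key and uses
    "completed" / "error" as terminal values. Adjust per provider.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_status()
        status = job.get("status")
        if status == "completed":
            return job
        if status == "error":
            raise RuntimeError(job.get("error", "transcription failed"))
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still '{status}' after {timeout_s}s")
        sleep(interval_s)


# Usage with a stubbed status endpoint (no network involved).
_fake = iter([
    {"status": "queued"},
    {"status": "processing"},
    {"status": "completed", "text": "hello"},
])
result = poll_until_done(lambda: next(_fake), interval_s=0, sleep=lambda s: None)
```

Passing `fetch_status` and `sleep` in as callables keeps the loop independent of any HTTP client, so it can be exercised with a stub.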

Requires:

  • multipart/form-data support in proxy.py (currently only handles JSON bodies)
  • New url UI param type for audio URL input
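For the multipart/form-data requirement, one possible shape is a small encoder that turns text fields into a multipart body plus the matching `Content-Type` header. This is only a sketch using the standard library; `encode_multipart` is a hypothetical helper, not an existing function in proxy.py, and the field names `model` and `url` in the usage lines are illustrative rather than any provider's confirmed parameters.

```python
import uuid
from typing import Dict, Optional, Tuple


def encode_multipart(
    fields: Dict[str, str], boundary: Optional[str] = None
) -> Tuple[bytes, str]:
    """Encode plain text fields as multipart/form-data.

    Returns (body, content_type). Assumes field names are plain ASCII
    without quotes, so no escaping is performed.
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(f"--{boundary}\r\n".encode("utf-8"))
        parts.append(
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode("utf-8")
        )
        parts.append(value.encode("utf-8"))
        parts.append(b"\r\n")
    # Closing delimiter marks the end of the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


# Usage: hypothetical field names for a URL-based transcription request.
body, content_type = encode_multipart(
    {"model": "example-model", "url": "https://example.com/audio.mp3"},
    boundary="BOUNDARY",
)
```

The returned `content_type` has to be sent as the request's `Content-Type` header, since the receiving server needs the boundary to parse the body.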

Phase 2 — File upload:

  • OpenAI Whisper, DeepInfra, SambaNova
  • Needs a file upload UI component and binary body handling
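Binary body handling for Phase 2 mainly means the audio bytes must pass through untouched rather than being decoded as text or JSON. A rough sketch of a single file part is shown below, again with the standard library only. The helper name `encode_file_part`, the field name `file`, and the `audio/wav` content type are assumptions for illustration; each provider's actual field names would need to be checked.

```python
import uuid
from typing import Optional, Tuple


def encode_file_part(
    field_name: str,
    filename: str,
    data: bytes,
    content_type: str = "application/octet-stream",
    boundary: Optional[str] = None,
) -> Tuple[bytes, str]:
    """Wrap raw bytes as one multipart/form-data file part.

    The payload is concatenated as bytes and never decoded, so arbitrary
    binary audio survives unchanged. Returns (body, content_type_header).
    """
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


# Usage: bytes that are not valid UTF-8 still round-trip intact.
audio = bytes([0x00, 0xFF, 0x10, 0x80])
body, header = encode_file_part("file", "clip.wav", audio, "audio/wav", boundary="B")
```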

A design plan exists in docs/plans/2026-02-23-speech-to-text-plan.md.


Labels: enhancement (New feature or request)
