Real-Time Full-Duplex Voice Assistant

Low-latency, interruptible, full-duplex (talk & listen at the same time) voice assistant with a web UI, streaming ASR, TTS, and LLM orchestration. Built for real conversations, barge-in, and hands-free control.

✨ Features

Full-duplex audio: talk and listen simultaneously (barge-in / interruption supported).
Streaming ASR: incremental transcripts while you speak.
Streaming TTS: the assistant starts speaking before the full text response has finished generating.
LLM orchestration: tool use/function calls and stateful dialog.
Web UI: mic capture, waveforms, and live captions in-browser.
Production-ready stack: Traefik reverse proxy + auto TLS, Nginx static hosting, FastAPI backend.
Single-command deploy: bring the whole stack up with docker compose up -d.
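
To make the barge-in behaviour concrete, here is a minimal sketch of one way to interrupt playback with asyncio. It is not taken from this repo: `speak`, `run_with_barge_in`, and the `interrupt` event are hypothetical names, and playback is simulated with short sleeps instead of real audio.

```python
import asyncio


async def speak(chunks, played):
    """Pretend to play TTS audio chunk by chunk (stand-in for real playback)."""
    for chunk in chunks:
        await asyncio.sleep(0.01)  # simulated playback time per chunk
        played.append(chunk)


async def run_with_barge_in(chunks, interrupt: asyncio.Event):
    """Play chunks until playback ends or `interrupt` is set, whichever is first."""
    played = []
    playback = asyncio.create_task(speak(chunks, played))
    barge_in = asyncio.create_task(interrupt.wait())
    done, pending = await asyncio.wait(
        {playback, barge_in}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        # Cancels playback on barge-in, or the waiter if playback finished first.
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    return played, barge_in in done


async def demo():
    interrupt = asyncio.Event()
    # Simulate the user starting to talk 35 ms into the assistant's reply.
    asyncio.get_running_loop().call_later(0.035, interrupt.set)
    return await run_with_barge_in([f"chunk-{i}" for i in range(10)], interrupt)


if __name__ == "__main__":
    played, interrupted = asyncio.run(demo())
    print(f"played {len(played)} of 10 chunks, interrupted={interrupted}")
```

In a real session the `interrupt` event would presumably be set when the ASR side detects new user speech, so the assistant stops talking as soon as the user does.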

🧭 Architecture

Application Flow

Browser (Web UI)
├─ Mic capture (WebAudio) → WebSocket → Assistant (FastAPI)
│                                            │
│                                    partial transcripts
│                                            ▼
├─ Live captions        ←  ASR (streaming via Assistant)
│                                            │
│                                            ▼
├─ TTS audio playback   ←  TTS (streaming chunks)
│                                            ▲
│                                            │
└─ Controls/Events      →  LLM Orchestrator
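
The flow above can be sketched as a chain of async generators. This is only an illustration of the streaming shape, not the repo's implementation: `mic_frames`, `streaming_asr`, `streaming_tts`, and `handle_turn` are hypothetical names, words stand in for audio frames, and the LLM call is replaced by a placeholder string.

```python
import asyncio
from typing import AsyncIterator, List


async def mic_frames(words: List[str]) -> AsyncIterator[str]:
    """Stand-in for audio frames arriving over the WebSocket (one word per frame)."""
    for word in words:
        await asyncio.sleep(0)  # yield control, as a real network read would
        yield word


async def streaming_asr(frames: AsyncIterator[str]) -> AsyncIterator[str]:
    """Emit a growing partial transcript after every frame, like live captions."""
    heard: List[str] = []
    async for frame in frames:
        heard.append(frame)
        yield " ".join(heard)


async def streaming_tts(text: str, chunk_size: int = 8) -> AsyncIterator[bytes]:
    """Yield fake audio chunks one slice of text at a time."""
    for start in range(0, len(text), chunk_size):
        await asyncio.sleep(0)
        yield text[start:start + chunk_size].encode("utf-8")


async def handle_turn(words: List[str]):
    """Run one user turn: captions stream in, then reply audio streams out."""
    captions = []
    async for partial in streaming_asr(mic_frames(words)):
        captions.append(partial)  # each partial would be pushed to the UI
    final = captions[-1] if captions else ""
    reply = f"You said: {final}"  # placeholder for the LLM orchestrator
    audio = [chunk async for chunk in streaming_tts(reply)]
    return captions, audio


if __name__ == "__main__":
    captions, audio = asyncio.run(handle_turn(["turn", "on", "the", "lights"]))
    print(captions)
    print(len(audio), "audio chunks")
```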

🐋 Docker Stack & Routing

           ┌───────────────────────────┐
           │        Internet            │
           └────────────┬──────────────┘
                        │  :80 / :443
                        ▼
               ┌─────────────────┐
               │     Traefik     │
               │ (Reverse Proxy) │
               └───────┬─────────┘
         ┌─────────────┼─────────────┐
         │             │             │
┌────────▼───┐   ┌─────▼─────┐   ┌──▼────────┐
│   /        │   │   /api    │   │   /ws     │
│   Web UI   │   │  Assistant│   │ Assistant │
│ (Nginx)    │   │ (FastAPI) │   │ (FastAPI) │
└────────────┘   └───────────┘   └───────────┘
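
As a rough mental model of the routing in the diagram, the sketch below maps a request path to a service by longest matching prefix. It is a simplification written for illustration: `ROUTES` and `route` are hypothetical names, and Traefik's actual rule syntax and matching details are not reproduced here.

```python
# Path prefix -> service name, mirroring the three boxes in the diagram.
ROUTES = {
    "/ws": "assistant",
    "/api": "assistant",
    "/": "web",
}


def route(path: str) -> str:
    """Return the service for `path`, preferring the longest matching prefix."""
    for prefix in sorted(ROUTES, key=len, reverse=True):
        if prefix == "/":
            return ROUTES[prefix]  # catch-all for the static frontend
        # Match the prefix itself or anything below it (segment boundary).
        if path == prefix or path.startswith(prefix + "/"):
            return ROUTES[prefix]
    return "web"


if __name__ == "__main__":
    for p in ("/", "/index.html", "/api/tools/echo", "/ws/asr"):
        print(p, "->", route(p))
```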

Services in this repo

traefik: reverse proxy, automatic HTTPS via Let’s Encrypt.
web: static frontend (served by Nginx).
assistant: FastAPI backend (ASR, TTS, LLM orchestration, WebSockets).
init_letsencrypt: bootstrap storage for ACME certificates.

🚀 Quick Start

1. Prerequisites

Docker & Docker Compose
Domain pointing to your server: com-cloud.cloud
DNS A/AAAA records configured
API keys for ASR, TTS, and LLM providers

2. Configure Environment

Create `src/assistant/.env` with your secrets:

# LLM / Orchestrator
LLM_PROVIDER=openai
OPENAI_API_KEY=sk-...

# ASR
ASR_PROVIDER=openai_realtime
ASR_API_KEY=...

# TTS
TTS_PROVIDER=openai_realtime
TTS_API_KEY=...

# CORS / ORIGINS
ALLOWED_ORIGINS=https://com-cloud.cloud

# Optional
LOG_LEVEL=info
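
A minimal sketch of how the backend might read these variables at startup, assuming they have already been exported into the process environment (for example by Docker Compose or a dotenv loader). `Settings` and `load_settings` are hypothetical names, and treating `ALLOWED_ORIGINS` as a comma-separated list is an assumption, not something this README specifies.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping

# Variables treated as mandatory in this sketch; LOG_LEVEL is optional.
REQUIRED = (
    "LLM_PROVIDER", "OPENAI_API_KEY",
    "ASR_PROVIDER", "ASR_API_KEY",
    "TTS_PROVIDER", "TTS_API_KEY",
)


@dataclass(frozen=True)
class Settings:
    llm_provider: str
    asr_provider: str
    tts_provider: str
    allowed_origins: List[str]
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Validate required variables and return non-secret settings."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    # Assumption: several origins may be given, separated by commas.
    origins = [o.strip() for o in env.get("ALLOWED_ORIGINS", "").split(",") if o.strip()]
    return Settings(
        llm_provider=env["LLM_PROVIDER"],
        asr_provider=env["ASR_PROVIDER"],
        tts_provider=env["TTS_PROVIDER"],
        allowed_origins=origins,
        log_level=env.get("LOG_LEVEL", "info"),
    )
```

API keys are checked for presence but deliberately left out of `Settings`, so that logging the object does not leak secrets.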

3. 🖥️ Local Development

Run backend directly:

cd src/assistant
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
uvicorn assistant.app:app --reload --host 0.0.0.0 --port 8000

Frontend

cd web
npm install
npm run dev

🎙️ Using the Assistant

Open https://com-cloud.cloud

Click the orb to connect and establish a WebSocket session.

Speak naturally; interrupt the assistant mid-sentence.

Watch live captions, hear real-time TTS playback.

Don't forget to close the tab when you are done!

⚙️ Configuration

Key options:

ASR: model, language hints, VAD sensitivity.

TTS: voice, speed, sample rate.

LLM: model, temperature, tool schemas.

Traefik: TLS challenge type, timeouts, rate limits.
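
To illustrate what a "VAD sensitivity" knob can mean, here is a generic energy-threshold voice activity check on 16-bit PCM frames. This README does not say which VAD the assistant uses, so treat this purely as an assumption-laden sketch: `rms`, `is_speech`, and the threshold constants are all hypothetical.

```python
import math
from array import array


def rms(frame: bytes) -> float:
    """Root-mean-square amplitude of a frame of native-endian 16-bit PCM samples."""
    samples = array("h")
    samples.frombytes(frame)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_speech(frame: bytes, sensitivity: float = 0.5) -> bool:
    """Classify a frame as speech; higher sensitivity lowers the energy threshold."""
    sensitivity = min(1.0, max(0.0, sensitivity))
    # Illustrative mapping: threshold ranges from 3200 (least) to 200 (most sensitive).
    threshold = 3000.0 * (1.0 - sensitivity) + 200.0
    return rms(frame) >= threshold


if __name__ == "__main__":
    quiet = array("h", [500, -500] * 80).tobytes()
    print(is_speech(quiet, sensitivity=0.5), is_speech(quiet, sensitivity=0.95))
```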

🔌 API

GET /healthz – service health

WS /ws/asr – audio in ↔ transcript out

WS /ws/assistant – dialog orchestration (events + responses)

WS /ws/tts – text in ↔ audio out

POST /api/tools/<name> – trigger server-side tool functions
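
One way the `POST /api/tools/<name>` endpoint could dispatch to server-side functions is a small name-to-function registry. The sketch below is framework-free and hypothetical: `TOOLS`, `tool`, `call_tool`, and the `echo` tool are illustrative names, and the status codes are an assumed convention rather than this API's documented behaviour.

```python
from typing import Any, Callable, Dict, Tuple

# Registry of tool name -> callable (hypothetical; not this repo's API).
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(name: str):
    """Decorator that registers a function under `name`."""
    def register(fn: Callable[..., Any]) -> Callable[..., Any]:
        TOOLS[name] = fn
        return fn
    return register


@tool("echo")
def echo(text: str) -> dict:
    return {"text": text}


def call_tool(name: str, arguments: Dict[str, Any]) -> Tuple[int, Any]:
    """Look up and invoke a tool, returning an (HTTP-like status, body) pair."""
    fn = TOOLS.get(name)
    if fn is None:
        return 404, {"error": f"unknown tool: {name}"}
    try:
        return 200, fn(**arguments)
    except TypeError as exc:
        # Wrong or missing arguments (this also catches TypeErrors raised inside the tool).
        return 422, {"error": str(exc)}


if __name__ == "__main__":
    print(call_tool("echo", {"text": "hi"}))
    print(call_tool("missing", {}))
```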

🔐 Security

HTTPS enforced (TLS via Let’s Encrypt + Traefik).

Strict CORS (limited to https://com-cloud.cloud).

API rate limiting enabled (/api).

Secrets kept in .env (not in frontend).
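
For intuition about the rate limiting on /api, the sketch below implements a plain token bucket, the model in which rate limiters such as Traefik's are usually described (an average rate plus a burst size). It is an illustration only: the `TokenBucket` class and its parameters are hypothetical, and the limits this deployment actually uses are not stated here.

```python
class TokenBucket:
    """Allow up to `burst` requests at once, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, burst: int, now: float = 0.0):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)  # start with a full bucket
        self.updated = now

    def allow(self, now: float) -> bool:
        """Return True and consume a token if a request at time `now` is permitted."""
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(float(self.burst), self.tokens + elapsed * self.rate)
        self.updated = max(self.updated, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=1.0, burst=2)
    # Three requests at t=0, then one a second later.
    print([bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)])
```

Passing the timestamp in explicitly keeps the example deterministic; a real limiter would read a monotonic clock instead.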

📦 Deployment Notes

Reverse proxy: Traefik v3 with ACME TLS challenge.

Certificates stored in ./letsencrypt/acme.json.

Static frontend served by Nginx (web service).

Backend served via assistant (FastAPI) behind Traefik.

Scale with Docker Swarm / k8s if needed.

🗺️ Roadmap

Wake-word detection

Speaker diarization

Plug-and-play tool registry

Persistent transcripts

Multi-voice TTS

🤝 Contributing

Fork this repo

Create a feature branch

Submit a PR, with screenshots or logs if the UI or backend is affected
