The most powerful, flexible open-source AI voice agent for Asterisk/FreePBX. Featuring a modular pipeline architecture that lets you mix and match STT, LLM, and TTS providers, plus 3 production-ready golden baselines validated for enterprise deployment.
- Modular Pipeline Architecture: Mix and match STT, LLM, and TTS providers independently
- 3 Golden Baselines: Production-validated configurations ready to deploy
- Privacy-Focused Options: Local Hybrid keeps audio processing on-premises
- Simplified Setup: 3 clear configuration choices (down from 6)
- Enterprise Monitoring: Prometheus + Grafana out of the box
- ExternalMedia RTP: Modern, reliable audio transport for pipelines
- Validated Performance: Every config tested in production
 
- Asterisk-Native: Works directly with your existing Asterisk/FreePBX - no external telephony providers required
 - Truly Open Source: MIT licensed with complete transparency and control
 - Modular Architecture: Choose cloud, local, or hybrid - mix providers as needed
 - Production-Ready: Battle-tested with validated configurations and enterprise monitoring
 - Cost-Effective: Local Hybrid costs ~$0.001-0.003/minute (LLM only)
 - Privacy-First: Keep audio local while using cloud intelligence
 
- OpenAI Realtime (Recommended for Quick Start)
  - Modern cloud AI with natural conversations
  - Response time: <2 seconds
  - Best for: Enterprise deployments, quick setup
- Deepgram Voice Agent (Enterprise Cloud)
  - Advanced Think stage for complex reasoning
  - Response time: <3 seconds
  - Best for: Deepgram ecosystem, advanced features
- Local Hybrid (Privacy-Focused)
  - Local STT/TTS + Cloud LLM (OpenAI)
  - Audio stays on-premises; only text goes to the cloud
  - Response time: 3-7 seconds
  - Best for: Audio privacy, cost control, compliance
 
- Modular Pipeline System: Independent STT, LLM, and TTS provider selection
- Dual Transport Support: AudioSocket (legacy) and ExternalMedia RTP (modern)
- High-Performance Architecture: Separate `ai-engine` and `local-ai-server` containers
- Enterprise Monitoring: Prometheus + Grafana with 5 dashboards and 50+ metrics
- State Management: SessionStore for centralized, typed call state
- Barge-In Support: Interrupt handling with configurable gating
- Docker Deployment: Simple two-service orchestration
- Customizable: YAML configuration for greetings, personas, and behavior
 
Experience all three golden baseline configurations with a single phone call:
Dial: (925) 736-6718
- Press 6 → Deepgram Voice Agent (Enterprise cloud with Think stage)
- Press 7 → OpenAI Realtime API (Modern cloud AI, most natural)
- Press 8 → Local Hybrid Pipeline (Privacy-focused, audio stays local)
 
Each configuration uses the same Ava persona with full project knowledge. Compare response times, conversation quality, and naturalness!
Demo Monitoring Dashboards: https://demo.jugaar.llc (username/password: demo/demo)
Get up and running in 5 minutes:
```bash
git clone https://github.com/hkjarral/Asterisk-AI-Voice-Agent.git
cd Asterisk-AI-Voice-Agent
./install.sh
```

The installer will:
- Guide you through 3 simple configuration choices
 - Prompt for required API keys (only what you need)
 - Set up Docker containers automatically
 - Configure Asterisk integration
 
When prompted, select one of the 3 golden baselines:
- [1] OpenAI Realtime - Fastest setup, modern AI (requires `OPENAI_API_KEY`)
- [2] Deepgram Voice Agent - Enterprise features (requires `DEEPGRAM_API_KEY` + `OPENAI_API_KEY`)
- [3] Local Hybrid - Privacy-focused (requires `OPENAI_API_KEY`, 8GB+ RAM)
The installer automatically starts the correct services for your choice.
Add this to your FreePBX (Config Edit → extensions_custom.conf):

```ini
[from-ai-agent]
exten => s,1,NoOp(Asterisk AI Voice Agent v4.0)
 same => n,Stasis(asterisk-ai-voice-agent)
 same => n,Hangup()
```
That's it! Without any variables, the system uses local_hybrid by default.
Then create a Custom Destination pointing to from-ai-agent,s,1 and route calls to it.
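On vanilla Asterisk (no FreePBX Custom Destinations), you can route a test extension into the same context instead. The context name and extension number below are illustrative — adapt them to your dialplan:

```ini
; extensions.conf - illustrative: dial 5000 internally to reach the agent
[from-internal-custom]
exten => 5000,1,NoOp(Route to AI agent)
 same => n,Goto(from-ai-agent,s,1)
```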
Make a call to your configured destination and have a conversation!
Verify health (optional):

```bash
curl http://127.0.0.1:15000/health
```

View logs:

```bash
docker compose logs -f ai-engine
```

That's it! Your AI voice agent is ready.
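For scripted deployments, a small wait loop on the same health endpoint can gate the cutover to live calls. This is a sketch, not part of the project; the retry count and sleep interval are arbitrary:

```shell
#!/bin/sh
# Sketch: poll the ai-engine health endpoint until it responds, then proceed.
# URL matches the curl check above; retry count and sleep are arbitrary choices.
wait_healthy() {
  url="$1"
  tries="${2:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -fsS "$url" >/dev/null 2>&1; then
      echo "healthy"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "timed out waiting for $url" >&2
  return 1
}
# usage: wait_healthy http://127.0.0.1:15000/health && docker compose logs -f ai-engine
```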
For detailed setup, see docs/FreePBX-Integration-Guide.md
- `config/ai-agent.yaml` - Golden baseline configs (safe to commit)
- `.env` - Secrets and API keys (git-ignored)
The installer handles everything automatically. To customize:
Change greeting or persona:
Edit config/ai-agent.yaml:
```yaml
llm:
  initial_greeting: "Your custom greeting"
  prompt: "Your custom AI persona"
```

Add/change API keys:
Edit .env:
```ini
OPENAI_API_KEY=sk-your-key-here
DEEPGRAM_API_KEY=your-key-here
ASTERISK_ARI_USERNAME=asterisk
ASTERISK_ARI_PASSWORD=your-password
```

Switch configurations:
```bash
# Copy a different golden baseline
cp config/ai-agent.golden-deepgram.yaml config/ai-agent.yaml
docker compose up -d --force-recreate ai-engine
```

If you enabled monitoring during installation, you have Prometheus + Grafana running:
Access Grafana:
http://your-server-ip:3000
Username: admin
Password: admin (change after first login)
If you didn't enable monitoring during install, you can start it anytime:
```bash
docker compose -f docker-compose.monitoring.yml up -d
```

Stop monitoring:

```bash
docker compose -f docker-compose.monitoring.yml down
```

Note: Monitoring is completely optional. The AI agent works without it. See monitoring/README.md for dashboards, alerts, and metrics.
For advanced tuning, see:
- docs/Configuration-Reference.md - Complete reference
 - docs/Transport-Mode-Compatibility.md - Transport modes
 
Two-container architecture for performance and scalability:
ai-engine (Lightweight orchestrator)
- Connects to Asterisk via ARI
 - Manages call lifecycle
 - Routes audio to/from AI providers
 - Handles state management
 
local-ai-server (Optional, for Local Hybrid)
- Runs local STT/TTS models
 - Vosk (speech-to-text)
 - Piper (text-to-speech)
 - WebSocket interface
 
```text
┌─────────────────┐      ┌───────────┐      ┌───────────────────┐
│ Asterisk Server │─────▶│ ai-engine │─────▶│ AI Provider       │
│ (ARI, RTP)      │      │ (Docker)  │      │ (Cloud or Local)  │
└─────────────────┘      └───────────┘      └───────────────────┘
                           │     ▲
                           │ WS  │ (Local Hybrid only)
                           ▼     │
                         ┌─────────────────┐
                         │ local-ai-server │
                         │ (Docker)        │
                         └─────────────────┘
```
Key Design Principles:
- Separation of concerns - AI processing isolated from call handling
 - Modular pipelines - Mix and match STT, LLM, TTS providers
 - Transport flexibility - AudioSocket (legacy) or ExternalMedia RTP (modern)
 - Enterprise-ready - Monitoring, observability, production-hardened
 
For Cloud Configurations (OpenAI Realtime, Deepgram):
- CPU: 2+ cores
 - RAM: 4GB
 - Disk: 1GB
 - Network: Stable internet connection
 
For Local Hybrid (Local STT/TTS + Cloud LLM):
- CPU: 4+ cores (modern 2020+)
 - RAM: 8GB+ recommended
 - Disk: 2GB (models + workspace)
 - Network: Stable internet for LLM API
 
- Docker + Docker Compose
 - Asterisk 18+ with ARI enabled
 - FreePBX (recommended) or vanilla Asterisk
 
| Configuration | Required Keys | 
|---|---|
| OpenAI Realtime | OPENAI_API_KEY | 
| Deepgram Voice Agent | DEEPGRAM_API_KEY + OPENAI_API_KEY | 
| Local Hybrid | OPENAI_API_KEY | 
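Before starting the stack, you can sanity-check that `.env` contains the keys your chosen baseline needs. This helper is a sketch, not part of the installer; the key names match the table above:

```shell
#!/bin/sh
# Sketch: verify required API keys are present (and non-empty) in an env file.
# Key names match the table above; this helper is not part of the installer.
check_keys() {
  envfile="$1"
  shift
  missing=0
  for key in "$@"; do
    if ! grep -Eq "^${key}=.+" "$envfile"; then
      echo "missing: $key" >&2
      missing=1
    fi
  done
  return "$missing"
}
# usage, e.g. for the Deepgram Voice Agent baseline:
#   check_keys .env DEEPGRAM_API_KEY OPENAI_API_KEY
```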
- FreePBX Integration Guide - Complete setup with dialplan examples
 - Installation Guide - Detailed installation and deployment
 
- Configuration Reference - All YAML settings explained
 - Transport Compatibility - AudioSocket vs ExternalMedia RTP
 - Tuning Recipes - Performance optimization guide
 
- Monitoring Guide - Prometheus + Grafana dashboards (coming soon)
 - Production Deployment - Production best practices (coming soon)
 - Hardware Requirements - System specs and sizing (coming soon)
 
- Architecture - System design and components
 - Contributing - How to contribute
 - Changelog - Release history and changes
 
Contributions are welcome! Please see our Contributing Guide for more details on how to get involved.
Have questions or want to chat with other users? Join our community:
This project is licensed under the MIT License. See the LICENSE file for details.
If you find this project useful, please give it a ⭐ on GitHub! It helps us gain visibility and encourages more people to contribute.
