
TranscriptARI is a sophisticated telephony management system built on Asterisk's ARI (Asterisk REST Interface). It provides voice call handling, transcription, and PBX control capabilities. The system combines WebSockets, RTP, and Google Speech-to-Text integration to create a modern, feature-rich telephony solution.
- 🔄 Call Management - Handles incoming and outgoing calls through Asterisk PBX
- 🎙️ Speech-to-Text - Real-time transcription of calls using Google Cloud Speech API
- 🌉 Bridge Management - Creates and manages voice bridges for connecting multiple channels
- 👥 Contact Recognition - Supports voice-activated dialing using a contacts database
- 📡 External Media Channels - Supports external media integration for advanced use cases
- 🔌 WebSocket Interface - Provides real-time updates and control via WebSockets
The system is built on TypeScript and Node.js with a modular architecture:
🎮 AriControllerServer
The main controller that interfaces with Asterisk PBX:
- Manages call flows, bridges, and DTMF input
- Handles Stasis application events (start, end)
- Provides WebSocket server for client connections
- Manages contact lookups for voice-activated dialing
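The contact lookup behind voice-activated dialing could be sketched roughly as below. This is an illustration only: the `Contact` shape, the `normalize` and `findContact` helpers, and the sample data are assumptions, and the real schema of `tools/contacts.json` may differ.

```typescript
// Hypothetical contact shape; the real schema of tools/contacts.json may differ.
interface Contact {
  name: string;
  number: string;
}

// Lowercase, replace punctuation with spaces, and collapse whitespace so that
// "Call Alice, please!" and "call alice please" compare equal.
// ASCII-only for simplicity; names with accents would need a wider character class.
function normalize(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

// Return the first contact whose full name appears as whole words in the transcript.
function findContact(transcript: string, contacts: Contact[]): Contact | undefined {
  const spoken = " " + normalize(transcript) + " ";
  for (const contact of contacts) {
    const name = normalize(contact.name);
    if (name.length > 0 && spoken.indexOf(" " + name + " ") !== -1) {
      return contact;
    }
  }
  return undefined;
}

// Sample data for illustration only.
const sampleContacts: Contact[] = [
  { name: "Alice Smith", number: "1001" },
  { name: "Bob", number: "1002" },
];
```

Padding the normalized strings with spaces keeps "bobby" from matching the contact "Bob"; a production matcher would likely also need fuzzy matching to tolerate transcription errors.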
🔤 AriTranscriberServer
Provides real-time speech transcription:
- Connects to Google Cloud Speech API
- Processes RTP audio streams
- Transmits transcription results via WebSockets
- Supports customizable language and model settings
📡 RTP UDP Server
Handles the real-time audio streaming:
- Processes incoming RTP packets from Asterisk
- Handles audio format conversion for transcription
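As a rough illustration of these two steps, the sketch below parses the fixed RTP header described in RFC 3550 and decodes a G.711 mu-law payload into 16-bit linear PCM. It assumes the external media channel is configured for mu-law; the function names (`parseRtp`, `muLawToPcm16`, `decodeMuLaw`) are hypothetical and not taken from the project.

```typescript
// Fields of the fixed RTP header (RFC 3550), plus the payload bytes.
interface RtpPacket {
  version: number;
  payloadType: number;
  sequenceNumber: number;
  timestamp: number;
  ssrc: number;
  payload: Uint8Array;
}

// Parse one RTP packet; returns null for truncated or non-version-2 packets.
function parseRtp(packet: Uint8Array): RtpPacket | null {
  if (packet.length < 12) return null;
  const view = new DataView(packet.buffer, packet.byteOffset, packet.byteLength);
  const b0 = view.getUint8(0);
  const version = b0 >> 6;
  if (version !== 2) return null;
  const hasPadding = (b0 & 0x20) !== 0;
  const hasExtension = (b0 & 0x10) !== 0;
  const csrcCount = b0 & 0x0f;
  // Skip the 12-byte fixed header and any 4-byte CSRC identifiers.
  let offset = 12 + 4 * csrcCount;
  if (hasExtension) {
    if (packet.length < offset + 4) return null;
    // Extension header: 16-bit profile id, then 16-bit length in 32-bit words.
    offset += 4 + 4 * view.getUint16(offset + 2);
  }
  let end = packet.length;
  // With padding, the last byte holds the number of padding bytes to drop.
  if (hasPadding) end -= packet[packet.length - 1];
  if (offset > end) return null;
  return {
    version: version,
    payloadType: view.getUint8(1) & 0x7f,
    sequenceNumber: view.getUint16(2),
    timestamp: view.getUint32(4),
    ssrc: view.getUint32(8),
    payload: packet.subarray(offset, end),
  };
}

// Decode one G.711 mu-law byte to a signed 16-bit linear PCM sample.
function muLawToPcm16(byte: number): number {
  const u = ~byte & 0xff;
  const exponent = (u >> 4) & 0x07;
  const mantissa = u & 0x0f;
  const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
  return (u & 0x80) !== 0 ? -magnitude : magnitude;
}

// Convert a whole mu-law payload to linear PCM samples.
function decodeMuLaw(payload: Uint8Array): Int16Array {
  const pcm = new Int16Array(payload.length);
  for (let i = 0; i < payload.length; i++) {
    pcm[i] = muLawToPcm16(payload[i]);
  }
  return pcm;
}
```

In a UDP handler, each datagram would go through `parseRtp` first, and only packets with the expected payload type would be decoded and forwarded to the transcriber.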
🗣️ Google Speech Provider
Integration with Google's Speech-to-Text API:
- Handles streaming transcription with automatic restarts
- Manages audio chunking for optimal performance
- Provides both interim and final transcription results
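Chunking and automatic restarts might look something like the sketch below. Streaming recognition sessions are time-limited on Google's side (roughly five minutes at the time of writing, which is worth checking against current quotas), so a long call needs the stream reopened periodically. The `AudioSink` interface and `RestartingStream` class are hypothetical stand-ins; a real provider would wrap the `@google-cloud/speech` streaming call behind `openSink`.

```typescript
// Split audio into chunks of at most chunkSize bytes (the last may be shorter).
function chunkAudio(audio: Uint8Array, chunkSize: number): Uint8Array[] {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  const chunks: Uint8Array[] = [];
  for (let start = 0; start < audio.length; start += chunkSize) {
    chunks.push(audio.subarray(start, start + chunkSize));
  }
  return chunks;
}

// Hypothetical minimal sink; stands in for a streaming recognition request.
interface AudioSink {
  write(chunk: Uint8Array): void;
  end(): void;
}

// Forwards chunks to a sink and reopens it once it has been open for maxStreamMs.
class RestartingStream {
  restarts = 0;
  private sink: AudioSink | null = null;
  private openedAt = 0;
  private openSink: () => AudioSink;
  private maxStreamMs: number;
  private now: () => number;

  constructor(openSink: () => AudioSink, maxStreamMs: number, now?: () => number) {
    this.openSink = openSink;
    this.maxStreamMs = maxStreamMs;
    // The clock is injectable so the restart logic can be tested without waiting.
    this.now = now !== undefined ? now : () => Date.now();
  }

  write(chunk: Uint8Array): void {
    const t = this.now();
    if (this.sink !== null && t - this.openedAt >= this.maxStreamMs) {
      // The current session is too old: close it and count the restart.
      this.sink.end();
      this.sink = null;
      this.restarts++;
    }
    if (this.sink === null) {
      this.sink = this.openSink();
      this.openedAt = t;
    }
    this.sink.write(chunk);
  }

  close(): void {
    if (this.sink !== null) this.sink.end();
    this.sink = null;
  }
}
```

Restarting on a write boundary keeps the logic simple, but audio spoken across the restart may be split between two sessions; a fuller implementation would probably replay the last unfinalized audio into the new stream.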
The system uses environment variables for configuration, including:
| Category | Variables |
|---|---|
| PBX | PBX IP address, login credentials |
| WebSocket | Server ports, external host information |
| Transcription | Language settings, model configuration |
| Telephony | Provider settings, phone numbers |
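A minimal sketch of reading and validating such settings at startup is shown below. Every variable name in it (`PBX_HOST`, `PBX_USER`, `PBX_PASSWORD`, `WS_PORT`, `TRANSCRIPTION_LANGUAGE`) and the defaults are hypothetical placeholders; `env.example` lists the names the project actually uses.

```typescript
// All variable names below are hypothetical; env.example lists the real ones.
interface AppConfig {
  pbxHost: string;
  pbxUser: string;
  pbxPassword: string;
  wsPort: number;
  language: string;
}

type Env = { [key: string]: string | undefined };

// Return the value of a required variable, or throw a descriptive error.
function required(env: Env, key: string): string {
  const value = env[key];
  if (value === undefined || value.trim() === "") {
    throw new Error("Missing required environment variable: " + key);
  }
  return value;
}

// Build a typed config object, failing fast on missing or malformed values.
function loadConfig(env: Env): AppConfig {
  const port = Number(env["WS_PORT"] ?? "8080");
  // Math.floor(NaN) !== NaN, so non-numeric input is rejected here too.
  if (Math.floor(port) !== port || port < 1 || port > 65535) {
    throw new Error("WS_PORT must be an integer between 1 and 65535");
  }
  return {
    pbxHost: required(env, "PBX_HOST"),
    pbxUser: required(env, "PBX_USER"),
    pbxPassword: required(env, "PBX_PASSWORD"),
    wsPort: port,
    language: env["TRANSCRIPTION_LANGUAGE"] ?? "en-US",
  };
}
```

After `dotenv` has loaded the `.env` file, this would be called as `loadConfig(process.env)`; validating once at startup surfaces a missing credential immediately rather than on the first call.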
- Set up a FreePBX server (see the FreePBX Server Installation and Configuration Guide)
- Configure environment variables in the `.env` file (see `env.example` for reference)
- Set up Google Cloud credentials for speech recognition
- Configure contacts in `tools/contacts.json` for voice-activated dialing
- Start the system with `npm start` or `npx ts-node -T core/manager.ts`
- Asterisk PBX with ARI enabled
- Node.js and TypeScript
- Google Cloud Speech API credentials
- Various npm packages, including `ari-client`, `@google-cloud/speech`, `ws`, `express`, and `dotenv`
- 🔧 Enhanced typing for TypeScript
- 🖥️ Web UI for monitoring and management
- 🔊 Additional speech recognition providers, such as a self-hosted Whisper model or Scribe from ElevenLabs
- 📊 Call analytics and reporting features
This project is licensed under the MIT License - see the LICENSE file for details.
The MIT License is a permissive license that allows anyone to:
- Use the software for any purpose
- Change the software to suit your needs
- Share the software with anyone
- Sell the software or build commercial software with it
The only requirement is to include the original copyright notice and license text in any copy of the software or its source.
Built with ❤️ for modern telephony solutions