
Modern ARI Stasis server built on Asterisk ARI, with real-time speech-to-text transcription, voice-activated dialing, and WebSocket integration. Telephony control using TypeScript/Node.js.


alexiokay/VoiceFlow-ts


📞 VoiceFlow (ARI Stasis Server)

TypeScript Node.js Asterisk License: MIT

Asterisk-powered telephony management with speech recognition and transcription


📋 Overview

VoiceFlow is a telephony management system built on the Asterisk REST Interface (ARI). It provides voice call handling, real-time transcription, and PBX control. Audio reaches the system from Asterisk as RTP, is transcribed with Google Speech-to-Text, and the results and call events are delivered to clients over WebSockets.

✨ Key Features

  • 🔄 Call Management - Handles incoming and outgoing calls through Asterisk PBX
  • 🎙️ Speech-to-Text - Real-time transcription of calls using Google Cloud Speech API
  • 🌉 Bridge Management - Creates and manages voice bridges for connecting multiple channels
  • 👥 Contact Recognition - Supports voice-activated dialing using a contacts database
  • 📡 External Media Channels - Supports external media integration for advanced use cases
  • 🔌 WebSocket Interface - Provides real-time updates and control via WebSockets
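External media channels are created in Asterisk through the ARI endpoint `POST /channels/externalMedia`, which tells Asterisk where to send the call audio as RTP. The sketch below only builds the request URL for that endpoint. The helper name `buildExternalMediaUrl`, the option names, and the host values are illustrative assumptions and are not taken from this repository.

```typescript
// Subset of the query parameters accepted by ARI's
// POST /channels/externalMedia endpoint.
interface ExternalMediaOptions {
  app: string;          // Stasis application that will own the channel
  externalHost: string; // "host:port" of the RTP listener
  format: string;       // audio format, e.g. "ulaw"
}

// Hypothetical helper: builds the full request URL for the endpoint.
function buildExternalMediaUrl(
  ariBaseUrl: string,
  opts: ExternalMediaOptions
): string {
  const params: Array<[string, string]> = [
    ["app", opts.app],
    ["external_host", opts.externalHost],
    ["format", opts.format],
  ];
  const query = params
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  // Drop trailing slashes so the path is not doubled up.
  return `${ariBaseUrl.replace(/\/+$/, "")}/ari/channels/externalMedia?${query}`;
}
```

The resulting URL would be sent as an authenticated POST request with the ARI credentials. A client library such as ari-client can issue the same request without the URL being built by hand.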

🏗️ Architecture

The system is built on TypeScript and Node.js with a modular architecture:

Core Components

🎮 AriControllerServer

The main controller that interfaces with Asterisk PBX:

  • Manages call flows, bridges, and DTMF input
  • Handles Stasis application events (start, end)
  • Provides WebSocket server for client connections
  • Manages contact lookups for voice-activated dialing
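The contact lookup behind voice-activated dialing can be pictured as matching a transcribed phrase against known names. This is a minimal sketch under stated assumptions: the `Contact` shape, the `findContact` function, and the sample entries are hypothetical, and the real schema of tools/contacts.json may differ.

```typescript
// Hypothetical contact shape; the real tools/contacts.json may differ.
interface Contact {
  name: string;
  number: string;
}

// Lowercase, replace punctuation with spaces, and collapse whitespace.
// Note: this simple version only keeps ASCII letters and digits.
function normalize(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

// Return the contact whose full name appears as whole words in the
// transcript, preferring the longest matching name.
function findContact(
  transcript: string,
  contacts: Contact[]
): Contact | undefined {
  const spoken = ` ${normalize(transcript)} `;
  let best: Contact | undefined;
  let bestLength = 0;
  for (const contact of contacts) {
    const name = normalize(contact.name);
    if (name.length > bestLength && spoken.includes(` ${name} `)) {
      best = contact;
      bestLength = name.length;
    }
  }
  return best;
}

const sampleContacts: Contact[] = [
  { name: "Alice Smith", number: "1001" },
  { name: "Bob", number: "1002" },
];

const match = findContact("Call Alice Smith, please.", sampleContacts);
console.log(match ? `Dialing ${match.number}` : "No contact recognized");
```

Matching on whole words keeps "Bobby" from triggering a call to "Bob". A production lookup would likely need fuzzy or phonetic matching to cope with transcription errors.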

🔤 AriTranscriberServer

Provides real-time speech transcription:

  • Connects to Google Cloud Speech API
  • Processes RTP audio streams
  • Transmits transcription results via WebSockets
  • Supports customizable language and model settings
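To illustrate how transcription results might travel over a WebSocket, here is a sketch of a JSON message format with an encoder and a validating decoder. The `TranscriptMessage` shape and every field name are assumptions made for this example; the README does not document the wire format the server actually uses.

```typescript
// Hypothetical wire format for one transcription result.
interface TranscriptMessage {
  type: "transcript";
  channelId: string;    // Asterisk channel the audio came from
  text: string;         // recognized text
  isFinal: boolean;     // false for interim results
  languageCode: string; // e.g. "en-US"
}

// Serialize a result to the string that would be passed to the socket.
function encodeTranscript(msg: Omit<TranscriptMessage, "type">): string {
  return JSON.stringify({ type: "transcript", ...msg });
}

// Parse and validate an incoming message; returns undefined when the
// payload is not valid JSON or does not match the expected shape.
function decodeTranscript(raw: string): TranscriptMessage | undefined {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return undefined;
  }
  if (typeof parsed !== "object" || parsed === null) return undefined;
  const { type, channelId, text, isFinal, languageCode } =
    parsed as Record<string, unknown>;
  if (
    type !== "transcript" ||
    typeof channelId !== "string" ||
    typeof text !== "string" ||
    typeof isFinal !== "boolean" ||
    typeof languageCode !== "string"
  ) {
    return undefined;
  }
  return { type, channelId, text, isFinal, languageCode };
}
```

With the ws package, the encoded string would be handed to `socket.send(...)`. Validating on the receiving side keeps a client from acting on malformed or unrelated messages.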

📡 RTP UDP Server

Handles the real-time audio streaming:

  • Processes incoming RTP packets from Asterisk
  • Handles audio format conversion for transcription
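Processing an RTP packet starts with reading the fixed 12-byte header defined in RFC 3550 and locating the audio payload behind it. The following sketch shows that step using only standard typed arrays. The names `RtpPacket` and `parseRtpPacket` are illustrative and not taken from this repository.

```typescript
interface RtpPacket {
  version: number;
  payloadType: number;
  sequenceNumber: number;
  timestamp: number;
  ssrc: number;
  payload: Uint8Array;
}

// Parse one RTP packet (RFC 3550). Returns undefined when the data is
// too short, is not RTP version 2, or has inconsistent lengths.
function parseRtpPacket(packet: Uint8Array): RtpPacket | undefined {
  if (packet.length < 12) return undefined;
  // DataView reads big-endian by default, which is network byte order.
  const view = new DataView(packet.buffer, packet.byteOffset, packet.byteLength);

  const first = view.getUint8(0);
  const version = first >> 6;
  if (version !== 2) return undefined;
  const hasPadding = (first & 0x20) !== 0;
  const hasExtension = (first & 0x10) !== 0;
  const csrcCount = first & 0x0f;
  const payloadType = view.getUint8(1) & 0x7f;

  // Skip the fixed header and any CSRC identifiers (4 bytes each).
  let offset = 12 + csrcCount * 4;
  if (hasExtension) {
    if (packet.length < offset + 4) return undefined;
    // The extension length is counted in 32-bit words after its 4-byte header.
    offset += 4 + view.getUint16(offset + 2) * 4;
  }

  // With padding, the last byte gives the number of padding bytes.
  let end = packet.length;
  if (hasPadding) end -= view.getUint8(packet.length - 1);
  if (offset > end) return undefined;

  return {
    version,
    payloadType,
    sequenceNumber: view.getUint16(2),
    timestamp: view.getUint32(4),
    ssrc: view.getUint32(8),
    payload: packet.subarray(offset, end),
  };
}
```

In a Node.js UDP server, each datagram received from Asterisk could be passed to this function, since a `Buffer` is a `Uint8Array`. The payload would then be forwarded for transcription, after any format conversion the configured codec requires.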

🗣️ Google Speech Provider

Integration with Google's Speech-to-Text API:

  • Handles streaming transcription with automatic restarts
  • Manages audio chunking for optimal performance
  • Provides both interim and final transcription results
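Chunking and automatic restarts can be sketched independently of the Google client library. Google limits the duration of a single streaming recognition request (roughly five minutes at the time of writing; check the current quota documentation), so a long call needs a fresh stream before that limit is reached. The names `chunkAudio` and `StreamRestartTimer` and the values 800 and 290000 below are assumptions for illustration and are not this project's actual settings.

```typescript
// Split audio into fixed-size chunks; the last chunk may be shorter.
function chunkAudio(audio: Uint8Array, chunkBytes: number): Uint8Array[] {
  if (!Number.isInteger(chunkBytes) || chunkBytes <= 0) {
    throw new RangeError("chunkBytes must be a positive integer");
  }
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < audio.length; offset += chunkBytes) {
    // subarray clamps the end index, so no manual bounds check is needed.
    chunks.push(audio.subarray(offset, offset + chunkBytes));
  }
  return chunks;
}

// Tracks the age of the current stream so the caller knows when to
// close it and open a new one. Time is passed in to keep this testable.
class StreamRestartTimer {
  private readonly limitMs: number;
  private startedAtMs: number;

  constructor(limitMs: number, nowMs: number) {
    this.limitMs = limitMs;
    this.startedAtMs = nowMs;
  }

  shouldRestart(nowMs: number): boolean {
    return nowMs - this.startedAtMs >= this.limitMs;
  }

  markRestarted(nowMs: number): void {
    this.startedAtMs = nowMs;
  }
}

// Assumed values: 800 bytes is 100 ms of 8 kHz, 8-bit mu-law audio.
const chunks = chunkAudio(new Uint8Array(2000), 800);
const timer = new StreamRestartTimer(290000, Date.now());
console.log(chunks.length, timer.shouldRestart(Date.now()));
```

In a real provider, each chunk would be written to the streaming recognition request. When `shouldRestart` returns true, the provider would end the current stream, open a new one, and call `markRestarted`.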

⚙️ Configuration

The system uses environment variables for configuration, including:

Category        Variables
PBX             PBX IP address, login credentials
WebSocket       Server ports, external host information
Transcription   Language settings, model configuration
Telephony       Provider settings, phone numbers
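As an illustration of environment-based configuration, the sketch below reads a few settings from an environment map and fails fast when a required one is missing. The variable names (`PBX_IP`, `ARI_USERNAME`, `ARI_PASSWORD`, `WS_PORT`, `LANGUAGE_CODE`) and the defaults are hypothetical; the names this project actually uses are listed in env.example.

```typescript
// Hypothetical subset of the application's settings.
interface AppConfig {
  pbxHost: string;
  ariUsername: string;
  ariPassword: string;
  wsPort: number;
  languageCode: string;
}

type Env = Record<string, string | undefined>;

// Read a variable that must be present and non-empty.
function requireVar(env: Env, key: string): string {
  const value = env[key];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${key}`);
  }
  return value;
}

// Build a typed config object from an environment map.
// All variable names and defaults here are assumptions.
function loadConfig(env: Env): AppConfig {
  const wsPort = Number(env["WS_PORT"] ?? "8080");
  if (!Number.isInteger(wsPort) || wsPort < 1 || wsPort > 65535) {
    throw new Error("WS_PORT must be an integer between 1 and 65535");
  }
  return {
    pbxHost: requireVar(env, "PBX_IP"),
    ariUsername: requireVar(env, "ARI_USERNAME"),
    ariPassword: requireVar(env, "ARI_PASSWORD"),
    wsPort,
    languageCode: env["LANGUAGE_CODE"] ?? "en-US",
  };
}
```

In the application this would be called as `loadConfig(process.env)` once dotenv has loaded the .env file. Taking the environment as a parameter keeps the function easy to test.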

🚀 Getting Started

Setup

  1. Set up a FreePBX server (see the FreePBX Server Installation and Configuration Guide)
  2. Configure environment variables in a .env file (see env.example for reference)
  3. Set up Google Cloud credentials for speech recognition
  4. Configure contacts in tools/contacts.json for voice-activated dialing
  5. Start the system with:
    npm start
    or
    npx ts-node -T core/manager.ts

💡 Use Cases

📞 Voice Call Center

Handle incoming calls with transcription for record-keeping

🤖 Automated Calling Systems

Set up outbound call campaigns with speech recognition

🗣️ Voice-Activated Dialing

Allow callers to speak names instead of dialing numbers

📝 Call Recording with Transcription

Keep searchable records of call content

📦 Dependencies

  • Asterisk PBX with ARI enabled
  • Node.js and TypeScript
  • Google Cloud Speech API credentials
  • Various NPM packages including:
    ari-client, @google-cloud/speech, ws, express, dotenv
    

🔮 Future Improvements

  • 🔧 Enhanced typing for TypeScript
  • 🖥️ Web UI for monitoring and management
  • 🔊 Additional speech recognition providers, such as a self-hosted Whisper model or Scribe from ElevenLabs
  • 📊 Call analytics and reporting features

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

The MIT License is a permissive license that allows anyone to:

  • Use the software for any purpose
  • Change the software to suit your needs
  • Share the software with anyone
  • Sell the software or build commercial software with it

The only requirement is to include the original copyright notice and license in any copy of the software/source.


Built with ❤️ for modern telephony solutions
