📞 AI Voice Caller Agent (Raspberry Pi + SIM7600)

A powerful, open-source Voice AI Agent capable of making and receiving phone calls, conversing naturally in Hinglish (Hindi + English), and booking meetings automatically. Powered by Google Gemini Live API and running on a Raspberry Pi 4B with a SIM7600G-H 4G GSM Module.

📺 Live Demo

I will be posting a full video demonstration of this agent in action on LinkedIn. Stay tuned here: Watch the Demo on LinkedIn (Link will be updated once the post is live!)

💡 Project Motivation

This project was born out of a simple need: Helping small business owners save time. Business owners often miss critical calls while working. This AI system acts as a 24/7 assistant that can:

Automated Reception: Take calls and book appointments when you are unavailable.
Lead Capture: Automatically gather potential client details so no inquiry is missed.
Customer Support: Answer frequent questions (FAQ) about your services or business hours.
After-Hours Assistant: Handle inquiries late at night so you can rest.

This system ensures that even if you can't pick up the phone, your business is still talking to your clients.

🌟 Features

Real-time Voice Conversation: Uses Google's Gemini Live (Multimodal Live API) for low-latency, natural voice interactions.
Hinglish Support: Specifically prompted to speak in a natural, conversational Indian context.
Auto-Meeting Booking: Detects user intent during calls and automatically saves meeting details.
Interruption Handling (Barge-in): If you start speaking while the AI is talking, it stops immediately and listens to you, which makes the conversation feel more human.
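One common way to get barge-in behaviour is to hold the model's outgoing audio in a queue and discard whatever is still queued as soon as an interruption is signalled. The sketch below shows only that queue-flushing step. The `PlaybackBuffer` class and its method names are hypothetical and are not taken from this project's script, which may handle interruptions differently.

```python
import queue
from typing import Optional


class PlaybackBuffer:
    """Holds audio chunks waiting to be played to the caller (hypothetical helper)."""

    def __init__(self) -> None:
        self._chunks: "queue.Queue[bytes]" = queue.Queue()

    def push(self, chunk: bytes) -> None:
        """Queue one chunk of model audio for playback."""
        self._chunks.put(chunk)

    def pop(self) -> Optional[bytes]:
        """Return the next chunk to play, or None if nothing is queued."""
        try:
            return self._chunks.get_nowait()
        except queue.Empty:
            return None

    def flush(self) -> int:
        """Discard all queued audio (call on barge-in); return how many chunks were dropped."""
        dropped = 0
        while True:
            try:
                self._chunks.get_nowait()
            except queue.Empty:
                return dropped
            dropped += 1
```

Calling `flush()` when the caller starts talking means the agent goes quiet right away instead of finishing audio that was already buffered.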

🚀 Future Extensions & Possibilities

This project is designed to be highly extensible. Here are some ways you can scale and upgrade this system:

1. The "Caller Farm" (Multi-Line AI Call Center)

A single Raspberry Pi + SIM7600 handles one call at a time, but you can scale this into a multi-agent call center:

Hardware Scaling: Use a GoIP GSM Gateway (4, 8, 16, or 32 SIM slots) connected to a local SIP server (Asterisk/FreePBX) on the Pi.
Cloud Scaling: Replace the SIM module with a VoIP provider like Twilio or Plivo to handle dozens of concurrent Gemini Live streams in the cloud.

2. Agentic AI Integration (OpenClaw / NanoClaw)

You can upgrade the AI from a simple voice bot to a fully autonomous AI employee using lightweight agent frameworks like NanoClaw or PicoClaw:

Live Tool Calling: Give the AI Python functions to execute during the call, such as checking a live Google Calendar, querying a database, or triggering the SIM7600 to send an SMS (AT+CMGS) while talking to the user.
Long-Term Memory: The agent can remember past callers. If a customer calls back 3 hours later, the AI can say, "Hello Rakesh, did you get the pricing email I sent earlier?"
Call Transfers: Program the AI to send DTMF tones or AT commands to transfer complex calls to your personal number.
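As an illustration of the SMS tool mentioned above, here is a minimal sketch of the byte sequences a tool function would write to the modem. `AT+CMGF=1` (text mode) and `AT+CMGS` are standard GSM AT commands. The function name `build_sms_commands` is hypothetical, and the sketch assumes plain ASCII message text. Writing the bytes to the serial port and waiting for the modem's `>` prompt is left out.

```python
from typing import List

CTRL_Z = b"\x1a"  # terminates the message body in text mode


def build_sms_commands(number: str, text: str) -> List[bytes]:
    """Return the byte sequences to write to the modem, in order."""
    if not number.lstrip("+").isdigit():
        raise ValueError(f"unexpected phone number: {number!r}")
    return [
        b"AT+CMGF=1\r",                                   # select SMS text mode
        f'AT+CMGS="{number}"\r'.encode("ascii"),          # modem answers with a "> " prompt
        text.encode("ascii", errors="replace") + CTRL_Z,  # body, then Ctrl-Z to send
    ]
```

A real tool function would send each item in turn and check the modem's reply before sending the next one.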

🏗️ System Workflow

🛠️ Hardware Requirements

To build this project, you will need the following components:

Raspberry Pi 4B Kit:
- Recommended: 4GB or 8GB RAM version.
- CRITICAL: Use the Original Raspberry Pi Power Supply. Using a generic phone charger will cause voltage drops and unstable behavior with the GSM module (which has high power spikes).
SIM7600G-H 4G GSM Module:
- CRITICAL: You MUST use this specific module, because it supports audio streaming over USB.
- Reason: It is one of the few modules that supports direct PCM audio streaming to and from the Raspberry Pi over USB (no external sound cards or complex wiring required).
- Supports 4G/3G/2G communication and GNSS positioning.
SIM Card:
- Preferred: Airtel (Recommended for better 3G/VoLTE fallback support).
- Note: I personally experienced compatibility issues with Jio SIMs in this setup, so they are not recommended.
USB Cable:
- A high-quality USB Type-C cable to connect the GSM Module's data port to the Raspberry Pi's USB port.

📦 Software Setup

1. Initial Pi & SIMCOM Setup

Install Raspberry Pi OS: Flash the standard Raspberry Pi OS to your SD card.
Connect Network: Ensure your Pi is connected to WiFi or Ethernet.
Install SIMCOM Config: You need to set up the SIM7600 module drivers/configuration ("Simcom").
- Reference Docs: Waveshare SIM7600G-H 4G HAT (B) Wiki
- Follow the wiki to ensure the module is recognized and the USB audio drivers are working.
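Once the drivers are in place, a quick way to confirm the module is recognized is to send a bare `AT` command and look for `OK`. The helper below is a sketch under assumptions: `modem_responds` is a hypothetical name, and it accepts any object with `write()` and `readline()`, such as a `serial.Serial` from pyserial opened with a timeout. The device path and baud rate in the usage comment are assumptions, so check the wiki for your board.

```python
def modem_responds(port, attempts: int = 3) -> bool:
    """Send a bare AT command and report whether the modem answers OK.

    `port` is any object with write() and readline(), for example a
    serial.Serial opened with a timeout.
    """
    for _ in range(attempts):
        port.write(b"AT\r")
        # Read a few lines: the modem may echo the command before replying.
        for _ in range(4):
            line = port.readline().strip()
            if line == b"OK":
                return True
            if line == b"ERROR":
                break
    return False


# Example usage (device path and baud rate are assumptions):
# import serial
# with serial.Serial("/dev/ttyUSB2", 115200, timeout=1) as port:
#     print(modem_responds(port))
```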

2. System Dependencies

Ensure your Raspberry Pi OS is up to date and you have Python 3.11+ installed.

sudo apt update && sudo apt upgrade
sudo apt install python3-pip python3-venv portaudio19-dev

3. Install Python Libraries

Create a virtual environment (optional but recommended) and install the required Python libraries:

pip install pyserial numpy google-genai --break-system-packages

Note: On Raspberry Pi OS Bookworm and newer, pip refuses to install into the system Python unless you pass the --break-system-packages flag. Inside a virtual environment the flag is unnecessary and can be omitted.

Note: You may need to create a Google Cloud Project and enable the Gemini API.

4. Configuration

Open the script (e.g., main.py).
API Key: Replace the placeholder GEMINI_API_KEY with your actual Google Gemini API Key.
- ⚡ Security Tip: Never commit your API key to GitHub! It's best to use an environment variable or a .env file to keep your keys safe.
Model Selection:
- Currently set to MODEL_NAME = "models/gemini-2.5-flash-native-audio-preview-12-2025".
- Future Updates: If the code stops working, this model name might have changed. Visit Google AI Studio, select the Multimodal Live feature, click "Get Code", and find the string MODEL_NAME = "...". Copy and paste that value into the script.
- Tier: This project works on the free tier of the Gemini API.
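Following the security tip above, here is a minimal sketch of reading the key from an environment variable instead of hardcoding it. The helper name `load_api_key` is hypothetical. The variable name `GEMINI_API_KEY` mirrors the placeholder in the script, and the model name is the one quoted above.

```python
import os

# Model name copied from the Configuration section above.
MODEL_NAME = "models/gemini-2.5-flash-native-audio-preview-12-2025"


def load_api_key(env_var: str = "GEMINI_API_KEY") -> str:
    """Read the API key from the environment instead of hardcoding it."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before starting the agent"
        )
    return key


# Example usage with the google-genai client:
# from google import genai
# client = genai.Client(api_key=load_api_key())
```

Failing at startup with a clear message is easier to debug than an authentication error in the middle of a call.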

🚀 Usage

Connect the SIM7600 module to the Raspberry Pi via USB.
Power on the Pi and ensure the SIM card has network connectivity (Status LED blinking slowly).
Run the agent:

python main.py

Call the number associated with the SIM card.
The AI answers with its scripted greeting:
- "Hello, main Satyam ki AI assistant baat kar rahi hoon..." (Hello, this is Satyam's AI assistant speaking...)
Book a Meeting:
- Say: "Mujhe meeting book karni hai" (I want to book a meeting).
- Provide your Name, Reason, and Phone Number.
- The script will detect the details and save them to bookings.json.
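Besides watching the status LED, the network check in the steps above can be done in software with the standard `AT+CSQ` command, which reports signal quality on a 0 to 31 scale (99 means unknown). The sketch below only parses the reply. `csq_to_dbm` is a hypothetical helper name, and sending the command to the modem is left out.

```python
import re
from typing import Optional

_CSQ_PATTERN = re.compile(r"\+CSQ:\s*(\d+),(\d+)")


def csq_to_dbm(response: str) -> Optional[int]:
    """Convert an AT+CSQ reply such as '+CSQ: 20,99' to signal strength in dBm.

    Returns None when the reply cannot be parsed or the modem reports a
    value outside 0..31 (99 means the signal is not known or not detectable).
    """
    match = _CSQ_PATTERN.search(response)
    if match is None:
        return None
    rssi = int(match.group(1))
    if not 0 <= rssi <= 31:
        return None
    # Standard mapping: 0 is -113 dBm or less, 31 is -51 dBm or greater.
    return -113 + 2 * rssi
```

A `None` result before placing a test call usually means the SIM has not registered on the network yet.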

📂 Project Structure

.
├── main.py             # Core logic (AT commands, Audio handling, Gemini API)
├── bookings.json       # Auto-generated file storing meeting requests
└── README.md           # This documentation

⚠️ Troubleshooting

ConnectionClosedError: If you see this from the Gemini API, ensure you are using the correct response_modalities. The Preview model currently works best with ["AUDIO"].
Low Voltage Warning: If your Pi throttles or the GSM module resets, CHECK YOUR POWER SUPPLY. The SIM7600 draws over 2A during calls.
Audio Noise: Ensure the USB cable is high quality and not too long.
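To confirm a power problem rather than guess, the Pi's firmware can report under-voltage and throttling through `vcgencmd get_throttled`. The sketch below decodes the bit flags most relevant here. The function names are hypothetical, and `read_throttled` only works on the Pi itself, where the `vcgencmd` tool is installed.

```python
import subprocess
from typing import List

# Bit positions reported by `vcgencmd get_throttled`.
THROTTLE_FLAGS = {
    0: "under-voltage detected now",
    2: "currently throttled",
    16: "under-voltage has occurred since boot",
    18: "throttling has occurred since boot",
}


def decode_throttled(output: str) -> List[str]:
    """Turn a line like 'throttled=0x50005' into human-readable flags."""
    value = int(output.strip().split("=", 1)[1], 16)
    return [label for bit, label in THROTTLE_FLAGS.items() if value & (1 << bit)]


def read_throttled() -> List[str]:
    """Query the firmware on the Pi itself (requires the vcgencmd tool)."""
    result = subprocess.run(
        ["vcgencmd", "get_throttled"], capture_output=True, text=True, check=True
    )
    return decode_throttled(result.stdout)
```

An empty list means no under-voltage or throttling has been recorded since boot. Any under-voltage flag points at the power supply or cable.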

📊 Data Format (`bookings.json`)

The system automatically captures details in a structured JSON file:

[
  {
    "timestamp": "2026-01-26T20:46:01",
    "name": "Rakesh Sharma",
    "reason": "Website Project Inquiry",
    "phone": "9876543210",
    "raw_text": "|| BOOK_MEETING || Name: Rakesh Sharma, Reason: Website Project, Phone: 9876543210"
  }
]
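The `raw_text` field suggests the model emits a tagged line that the script then turns into a record. Below is a minimal sketch of that idea, assuming the tag format shown in the sample. The function names and the regular expression are hypothetical, not the project's actual code. In the sample above the `reason` field differs slightly from the raw text, so the real script evidently extracts fields somewhat differently.

```python
import json
import re
from datetime import datetime
from pathlib import Path

_BOOKING_PATTERN = re.compile(
    r"\|\|\s*BOOK_MEETING\s*\|\|\s*"
    r"Name:\s*(?P<name>[^,]+),\s*"
    r"Reason:\s*(?P<reason>.+?),\s*"
    r"Phone:\s*(?P<phone>[\d+\- ]+)"
)


def parse_booking(raw_text, now=None):
    """Extract a booking record from a tagged line, or return None."""
    match = _BOOKING_PATTERN.search(raw_text)
    if match is None:
        return None
    stamp = (now or datetime.now()).replace(microsecond=0).isoformat()
    return {
        "timestamp": stamp,
        "name": match.group("name").strip(),
        "reason": match.group("reason").strip(),
        "phone": match.group("phone").strip(),
        "raw_text": raw_text,
    }


def append_booking(record, path="bookings.json"):
    """Append one record to the JSON array stored at `path`."""
    file = Path(path)
    bookings = json.loads(file.read_text(encoding="utf-8")) if file.exists() else []
    bookings.append(record)
    file.write_text(json.dumps(bookings, indent=2, ensure_ascii=False), encoding="utf-8")
```

Keeping the untouched `raw_text` alongside the parsed fields makes it possible to recover a booking by hand if the parsing ever goes wrong.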

💰 Estimated Budget

Building this kit costs approximately:

INR: ₹12,000 - ₹14,000
USD: $145 - $170 (approx.)

Note: Prices may vary based on your location and vendor choice.

🤝 Contributing

Open to contributions! Feel free to submit Pull Requests for better prompt engineering, more robust error handling, or support for other GSM modules.

📬 Contact

Feel free to reach out for collaborations or queries:

Email: [email protected]
LinkedIn: SatyamDevv
X (Twitter): @SatyamDevv

📄 License & Disclaimer

Disclaimer: This project is for educational purposes only. The creators do not promote any unethical behavior and take no responsibility for how this tool is used. Please comply with your local telecommunication laws regarding automated calling and recordings.

Open Source. Feel free to use and modify.
