A Go server that connects Twilio's programmable voice API with ElevenLabs AI agents to create an interactive AI phone system. This server handles both inbound and outbound calls, routing audio between the caller and ElevenLabs' conversational AI.
For more information, check out this article based on this implementation.
Here's a high-level flow of how these components interact:

    +----------+               +----------------+               +--------------+
    |          |     Audio     |                |  Audio (WS)   |              |
    |  Caller  | <-----------> |  Twilio Phone  | <-----------> |      Go      |
    |          |               |     System     |               |    Server    |
    +----------+               +----------------+               +--------------+
                                                                       ^
                                                                       |
                                                                       | Audio
                                                                       | (WebSocket)
                                                                       v
                                                                +--------------+
                                                                |              |
                                                                |  ElevenLabs  |
                                                                |      AI      |
                                                                |              |
                                                                +--------------+

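The relay step in the middle of this flow comes down to translating JSON frames between the two WebSocket connections. The sketch below is illustrative and is not taken from this repository: it assumes Twilio Media Streams frames of the form `{"event":"media","media":{"payload":...}}`, ElevenLabs frames of the form `{"user_audio_chunk":...}` (to the agent) and `{"type":"audio","audio_event":{"audio_base_64":...}}` (from the agent), and an agent configured for the same audio format Twilio uses, so no transcoding is needed. The function and type names are hypothetical, and the WebSocket plumbing itself is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// twilioMedia carries the base64-encoded audio of a Twilio media frame.
type twilioMedia struct {
	Payload string `json:"payload"`
}

// twilioMessage models only the fields of a Twilio Media Streams frame used here.
type twilioMessage struct {
	Event     string       `json:"event"`
	StreamSid string       `json:"streamSid,omitempty"`
	Media     *twilioMedia `json:"media,omitempty"`
}

// elevenLabsAudio and elevenLabsMessage model the assumed shape of an
// ElevenLabs "audio" event; other event types are ignored in this sketch.
type elevenLabsAudio struct {
	AudioBase64 string `json:"audio_base_64"`
}

type elevenLabsMessage struct {
	Type       string           `json:"type"`
	AudioEvent *elevenLabsAudio `json:"audio_event,omitempty"`
}

// callerToAgent turns a Twilio "media" frame into an ElevenLabs user audio
// chunk. The boolean is false for frames that carry no audio (start, stop, ...).
func callerToAgent(raw []byte) ([]byte, bool, error) {
	var msg twilioMessage
	if err := json.Unmarshal(raw, &msg); err != nil {
		return nil, false, err
	}
	if msg.Event != "media" || msg.Media == nil {
		return nil, false, nil
	}
	out, err := json.Marshal(map[string]string{"user_audio_chunk": msg.Media.Payload})
	if err != nil {
		return nil, false, err
	}
	return out, true, nil
}

// agentToCaller turns an ElevenLabs "audio" event into a Twilio "media" frame
// addressed to the given stream. The boolean is false for non-audio events.
func agentToCaller(raw []byte, streamSid string) ([]byte, bool, error) {
	var msg elevenLabsMessage
	if err := json.Unmarshal(raw, &msg); err != nil {
		return nil, false, err
	}
	if msg.Type != "audio" || msg.AudioEvent == nil {
		return nil, false, nil
	}
	out, err := json.Marshal(twilioMessage{
		Event:     "media",
		StreamSid: streamSid,
		Media:     &twilioMedia{Payload: msg.AudioEvent.AudioBase64},
	})
	if err != nil {
		return nil, false, err
	}
	return out, true, nil
}

func main() {
	// Sample frames with placeholder payloads, not real audio.
	fromTwilio := []byte(`{"event":"media","streamSid":"MZ123","media":{"payload":"AAAA"}}`)
	out, ok, err := callerToAgent(fromTwilio)
	fmt.Println(string(out), ok, err)

	fromAgent := []byte(`{"type":"audio","audio_event":{"audio_base_64":"BBBB","event_id":1}}`)
	out, ok, err = agentToCaller(fromAgent, "MZ123")
	fmt.Println(string(out), ok, err)
}
```

In a full server, each function would sit in its own goroutine loop: one reading from the Twilio socket and writing to ElevenLabs, the other doing the reverse.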
To follow along, you'll need:
- A Twilio account with a phone number
- An ElevenLabs account with an agent ID and API key
The server provides the following features:
- Inbound Call Handling: Process incoming calls to your Twilio number
- Outbound Call Capabilities: Initiate calls from your system through Twilio
- Conversational AI: Connect callers with ElevenLabs AI agents
- Customizable Prompts: Configure different prompts for inbound vs outbound calls
The server exposes the following endpoints:
- POST /incoming-call: Webhook endpoint for Twilio to send incoming call notifications
- GET /media-stream: WebSocket endpoint for bidirectional audio streaming
- POST /outbound-call: Initiate an outbound call
- POST /outbound-call-twiml: Generate TwiML for outbound calls
Configure the server with the following environment variables:
ELEVENLABS_API_KEY=your_elevenlabs_api_key
ELEVENLABS_AGENT_ID=your_elevenlabs_agent_id
TWILIO_ACCOUNT_SID=your_twilio_account_sid
TWILIO_AUTH_TOKEN=your_twilio_auth_token
TWILIO_PHONE_NUMBER=your_twilio_phone_number
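Failing fast when one of these variables is missing saves debugging time later. The following is a minimal sketch of how the five variables above might be loaded and validated at startup. The `config` struct and `loadConfig` function are hypothetical names, and the lookup function is injected so the logic can be tested without touching the real environment.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// config groups the settings the server needs to talk to both services.
type config struct {
	ElevenLabsAPIKey  string
	ElevenLabsAgentID string
	TwilioAccountSID  string
	TwilioAuthToken   string
	TwilioPhoneNumber string
}

// loadConfig reads every required variable through getenv and reports all
// missing ones in a single error instead of stopping at the first.
func loadConfig(getenv func(string) string) (config, error) {
	var missing []string
	get := func(key string) string {
		value := strings.TrimSpace(getenv(key))
		if value == "" {
			missing = append(missing, key)
		}
		return value
	}

	cfg := config{
		ElevenLabsAPIKey:  get("ELEVENLABS_API_KEY"),
		ElevenLabsAgentID: get("ELEVENLABS_AGENT_ID"),
		TwilioAccountSID:  get("TWILIO_ACCOUNT_SID"),
		TwilioAuthToken:   get("TWILIO_AUTH_TOKEN"),
		TwilioPhoneNumber: get("TWILIO_PHONE_NUMBER"),
	}
	if len(missing) > 0 {
		return config{}, fmt.Errorf("missing required environment variables: %s",
			strings.Join(missing, ", "))
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		// A real server would likely exit here; this sketch only reports.
		fmt.Println("configuration error:", err)
		return
	}
	fmt.Println("configuration loaded for agent", cfg.ElevenLabsAgentID)
}
```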