Analysis updated 2026-07-03
Build an AI phone bot that picks up inbound calls and holds a natural spoken conversation with callers
Create a voice assistant that listens through your microphone and replies using a language model of your choice
Trigger outbound phone calls from code and have an AI agent conduct the conversation automatically
| | vocodedev/vocode-core | flasgger/flasgger | mrforexample/comfyui-3d-pack |
|---|---|---|---|
| Stars | 3,741 | 3,742 | 3,742 |
| Language | Python | Python | Python |
| Setup difficulty | moderate | easy | moderate |
| Complexity | 3/5 | 2/5 | 3/5 |
| Audience | developer | developer | designer |
Figures from each repo's GitHub metadata at analysis time.
Requires API keys for at least one transcription provider, one language model, and one voice synthesis provider.
Vocode is an open source Python library for building voice-based AI agents that can hold real-time spoken conversations. You give it a microphone and speaker, connect it to a language model and a speech service, and it handles the real-time loop of listening, understanding, responding, and speaking, all in a streaming fashion so the conversation feels natural rather than clunky.
The library is designed around three interchangeable pieces: a transcription service that converts the incoming audio to text, a language model that generates the reply, and a text-to-speech service that speaks the response. Each piece has multiple provider options. For transcription you can choose from Deepgram, AssemblyAI, Whisper, Google Cloud, Azure, and others. For language models the built-in options include OpenAI and Anthropic. For voice synthesis there are providers like ElevenLabs, Play.ht, Azure, Google Cloud, AWS Polly, and more. Swapping one provider for another is a configuration change, not a code rewrite.
Beyond simple microphone conversations, Vocode can connect to phone calls. You can set up an inbound phone number that an AI agent picks up and talks to callers, or trigger outbound calls from code. It also supports dialing into Zoom meetings.
The README includes a working Python code example showing how to wire together a streaming conversation using Deepgram for transcription, ChatGPT as the agent, and Azure for synthesis, using just a handful of imports. Installation is a single pip command. Full documentation lives at docs.vocode.dev, and there is a Discord community for contributors and users. The project is actively looking for community maintainers and describes itself as very open to contributions.
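
The three-piece design described above can be illustrated with a small, library-independent sketch. This is not Vocode's API: `Transcriber`, `Agent`, `Synthesizer`, `Pipeline`, and the toy stand-in classes are all hypothetical names invented for illustration, and the loop is synchronous for brevity, whereas the real library streams audio. It only shows why swapping a provider can be a one-line change when each stage sits behind a common interface.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Protocol


# Hypothetical interfaces (NOT Vocode classes): one per pipeline stage.
class Transcriber(Protocol):
    def transcribe(self, audio_chunk: bytes) -> str: ...


class Agent(Protocol):
    def respond(self, text: str) -> str: ...


class Synthesizer(Protocol):
    def synthesize(self, text: str) -> bytes: ...


@dataclass
class Pipeline:
    """Chains speech-to-text, a reply generator, and text-to-speech."""

    transcriber: Transcriber
    agent: Agent
    synthesizer: Synthesizer

    def run(self, audio_chunks: Iterable[bytes]) -> Iterator[bytes]:
        for chunk in audio_chunks:
            text = self.transcriber.transcribe(chunk)
            if not text:
                # Nothing was recognised in this chunk, so skip it.
                continue
            reply = self.agent.respond(text)
            yield self.synthesizer.synthesize(reply)


# Toy stand-ins so the sketch runs without any API keys.
class EchoTranscriber:
    def transcribe(self, audio_chunk: bytes) -> str:
        # Pretends the "audio" is already UTF-8 text.
        return audio_chunk.decode("utf-8").strip()


class UppercaseAgent:
    def respond(self, text: str) -> str:
        return text.upper()


class PrefixAgent:
    def respond(self, text: str) -> str:
        return f"You said: {text}"


class BytesSynthesizer:
    def synthesize(self, text: str) -> bytes:
        # Pretends UTF-8 bytes are synthesized audio.
        return text.encode("utf-8")


if __name__ == "__main__":
    pipeline = Pipeline(EchoTranscriber(), UppercaseAgent(), BytesSynthesizer())
    print(list(pipeline.run([b"hello", b"  ", b"bye"])))  # [b'HELLO', b'BYE']

    # Swapping the "language model" touches only one constructor argument.
    pipeline = Pipeline(EchoTranscriber(), PrefixAgent(), BytesSynthesizer())
    print(list(pipeline.run([b"hi"])))  # [b'You said: hi']
```

In a real setup each stand-in would wrap a provider client configured with its own API key; for the actual class names and configuration objects, see the README example and docs.vocode.dev.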
Vocode is a Python library for building AI voice agents that hold real-time spoken conversations, with swappable providers for speech-to-text, language models, and text-to-speech.
Mainly Python. The stack also includes OpenAI and Anthropic.
License information is not mentioned in the description.
Setup difficulty is rated moderate, with roughly 30 minutes to a first successful run.
Intended audience: mainly developers.
Verify against the repo before relying on details.