explaingit

gradio-app/fastrtc

4,584 · JavaScript · Audience · developer · Complexity · 3/5 · Setup · moderate

TLDR

A Python library that turns any function into a live audio or video stream. You write the processing logic, and FastRTC handles the real-time connection plumbing so a browser or phone can connect to it.

Mindmap

mindmap
  root((FastRTC))
    Transport options
      WebRTC low latency
      WebSockets simpler
    Audio features
      Voice activity detection
      Text to speech optional
    Deployment
      Gradio test UI
      FastAPI mount
      Phone number Hugging Face
    Demo apps
      Voice chat AI
      Live transcription
      Webcam detection
      Voice code editor

Things people build with this

USE CASE 1

Build a real-time voice chatbot that detects when the user stops speaking, sends the audio to an AI model, and streams the response back as speech.

USE CASE 2

Add live webcam object detection to a FastAPI app by wrapping your detection function with FastRTC.
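
As a rough sketch of this use case, a video handler can be an ordinary function that takes one frame and returns one frame. Here `flip_vertical` is a hypothetical stand-in for a real detection function, and the `Stream(handler=..., modality="video", mode="send-receive")` call follows the pattern in the FastRTC README; treat both as assumptions and check the repo for the current API.

```python
def flip_vertical(frame):
    """Placeholder for a detection function: takes one frame, returns one frame.

    The frame is assumed to be a row-major image (for example a NumPy array of
    shape height x width x channels). Reversing the rows flips it upside down.
    A real handler would run a detector and draw boxes on the frame instead.
    """
    return frame[::-1]


def build_video_stream(handler=flip_vertical):
    """Wrap a frame handler in a FastRTC video stream (needs `pip install fastrtc`)."""
    # Imported lazily so the handler stays usable and testable without FastRTC.
    from fastrtc import Stream

    return Stream(handler=handler, modality="video", mode="send-receive")
```

Keeping the per-frame logic in a plain function means it can be tested on a static image before any streaming is involved.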

USE CASE 3

Give users a phone number to call your Python AI voice assistant over a regular phone line via Hugging Face.

USE CASE 4

Mount FastRTC onto an existing FastAPI server to add real-time audio without rewriting your app.
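
A minimal sketch of mounting, assuming the `stream.mount(app)` method shown in the FastRTC README. The helper name `add_realtime_audio` and the `/health` route are hypothetical and only illustrate that existing routes stay in place.

```python
def add_realtime_audio(app, stream):
    """Mount a FastRTC stream onto an existing FastAPI app and return the app."""
    # Assumption: Stream.mount(app) registers the real-time endpoints on the app.
    stream.mount(app)
    return app


def main():
    """Example wiring; needs `pip install fastrtc fastapi`."""
    from fastapi import FastAPI
    from fastrtc import ReplyOnPause, Stream

    app = FastAPI()

    @app.get("/health")  # stands in for the routes your app already has
    def health():
        return {"status": "ok"}

    def echo(audio):
        # Send each detected utterance straight back to the caller.
        yield audio

    stream = Stream(handler=ReplyOnPause(echo), modality="audio", mode="send-receive")
    return add_realtime_audio(app, stream)
```

The app returned by `main()` would then be served by an ASGI server such as uvicorn, with HTTPS in front of it for production WebRTC.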

Tech stack

Python · WebRTC · WebSockets · FastAPI · Gradio

Getting it running

Difficulty · moderate
Time to first run · 30 min

The phone number feature requires a Hugging Face account, and production WebRTC deployments require HTTPS.

No license information is stated in the explanation.

In plain English

FastRTC is a Python library that lets you add real-time audio and video streaming to an application with very little code. The core idea is that you write a regular Python function that processes audio or video, and FastRTC handles all the plumbing to turn that function into a live stream that a browser or phone can connect to.

The library supports two transport protocols: WebRTC and WebSockets. WebRTC is the standard technology that browsers use for video calls and is designed for low-latency, peer-to-peer media. WebSockets are a simpler alternative for cases where WebRTC is not needed. Both are available through the same interface.

For voice applications, FastRTC includes built-in voice activity detection that automatically figures out when the user has stopped speaking and hands that audio chunk to your function. This means you do not have to build your own silence detection to know when a speaker has finished a sentence before sending audio to a speech recognition model or an AI. A text-to-speech layer is also available as an optional install.

Deployment options are flexible. You can launch a quick test interface built on Gradio with a single method call. You can mount the stream onto an existing FastAPI web server to integrate it into a larger application. There is also a method that gives you a temporary phone number so someone can call into your stream over a regular phone line, which requires a Hugging Face account.

The README includes several demo applications built with the library: real-time voice conversations with models like Gemini, ChatGPT, and Claude, live speech transcription using Whisper, webcam object detection, and a voice-controlled code editor. Each example links to a live demo and its source code on Hugging Face Spaces. The library is published on PyPI and installs with pip.
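
The flow described above can be sketched as follows. This is a minimal example under stated assumptions, not a definitive implementation: the names `Stream`, `ReplyOnPause`, `stream.ui.launch()`, and `stream.fastphone()` follow the FastRTC README, while `echo`, `build_stream`, and `main` are hypothetical helper names chosen for illustration.

```python
def echo(audio):
    """Handler: receives one utterance as (sample_rate, samples) and yields it back.

    When wrapped in ReplyOnPause, the handler is expected to be called only
    after voice activity detection decides the speaker has paused.
    """
    sample_rate, samples = audio
    # A real app would call speech recognition or an AI model here
    # and yield the synthesized reply instead of the input.
    yield (sample_rate, samples)


def build_stream(handler=echo):
    """Wrap a handler in a FastRTC audio stream (needs `pip install fastrtc`)."""
    # Imported lazily so the handler above can be tested without FastRTC.
    from fastrtc import ReplyOnPause, Stream

    return Stream(handler=ReplyOnPause(handler), modality="audio", mode="send-receive")


def main():
    stream = build_stream()
    stream.ui.launch()  # Gradio test interface
    # stream.fastphone()  # temporary phone number; requires a Hugging Face account
```

Calling `main()` starts the test interface. The handler is a generator, so a longer reply can be yielded in several chunks rather than all at once.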

Copy-paste prompts

Prompt 1
Show me a FastRTC example that captures microphone audio, sends it to OpenAI Whisper for transcription, and streams the transcript back to the browser in real time.
Prompt 2
I have a FastAPI app. How do I mount FastRTC onto it so I can add a real-time voice endpoint without replacing my existing routes?
Prompt 3
Write a FastRTC handler that does real-time webcam object detection using a YOLOv8 model and streams annotated frames back to the browser.
Prompt 4
How do I use FastRTC's built-in voice activity detection so my function only receives audio when the user has finished speaking, not on every raw chunk?
Prompt 5
How do I get a phone number that lets users call into my FastRTC voice assistant over a regular phone line using Hugging Face?

Verify against the repo before relying on details.