explaingit

babelive/windows

2 · C# · Audience · developer · Complexity · 3/5 · Active · Setup · moderate

TLDR

Windows desktop app that captures system audio, sends it to OpenAI realtime translation, and plays translated voice plus side-by-side transcripts in an always-on-top overlay.

Mindmap

mindmap
  root((babelive-windows))
    Inputs
      System audio loopback
      OpenAI API key
    Outputs
      Translated voice
      Bilingual transcript overlay
      WAV and SRT recordings
    Use Cases
      Live translate meetings
      Translate YouTube videos
      Record bilingual subtitles
    Tech Stack
      C Sharp
      Avalonia
      DotNet 9
      WebSocket

Things people build with this

USE CASE 1

Translate Teams or Zoom meetings live with a side-by-side transcript

USE CASE 2

Get translated voiceover for YouTube and Spotify on Windows

USE CASE 3

Record paired WAV and SRT files for a translated session

USE CASE 4

Route blocked app audio through VB-CABLE for Teams or Skype

Tech stack

C# · .NET 9 · Avalonia · WebSocket · OpenAI

Getting it running

Difficulty · moderate · Time to first run · 30 min

Requires .NET 9 SDK and an OpenAI API key with access to gpt-realtime-translate.

In plain English

Babelive is a Windows desktop app, written in C# on .NET 9 with the Avalonia user-interface toolkit, that listens to whatever your computer is currently playing and translates it in real time. The audio can come from any program: a video call, a YouTube clip, a Spotify song, a Teams meeting. It captures the sound, sends it to an OpenAI realtime translation model called gpt-realtime-translate, plays the translated voice back through a speaker of your choice, and shows the original and translated text side by side in a floating, always-on-top, lyric-style panel at the bottom of the screen.

The README walks through the audio pipeline using a small diagram: Windows audio is captured with a loopback recorder, downmixed and resampled from 48 kHz to 24 kHz, packed into PCM16 format, and sent over a WebSocket connection to the OpenAI service. The model sends back translated audio plus running transcript snippets in both languages, which the app routes to the chosen playback device and to the overlay window.

To install and run it, a user needs Windows 10 or 11, the .NET 9 SDK, and an OpenAI API key with access to the translation model. Running "dotnet restore" and then "dotnet run" builds and starts the app. A "dotnet publish" command produces a single self-contained Babelive.exe of roughly 80 to 90 MB. The API key is saved locally in a settings.json file and is only sent to the configured API endpoint.

The interface lets you pick a target language, choose a capture source (all system audio or a specific app), pick a playback device, and adjust sliders for source ducking and translation volume. There are toggles for transcript-only mode, an alternate endpoint, echo suppression, and source muting. A record button writes timestamped folders containing the source audio as a WAV file, the translated audio as another WAV file, and matching SRT subtitle files for both languages.
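
As an illustration of the SRT files mentioned above: an SRT file is a sequence of numbered cues, each with a `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line, the text, and a blank line. The sketch below is written in Python for brevity and is not taken from the app, which is C#. The function names `srt_timestamp` and `to_srt` are hypothetical, and how Babelive actually segments the transcript into cues is not described here.

```python
def srt_timestamp(seconds):
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues):
    """Render (start_seconds, end_seconds, text) tuples as SRT text.

    Cues are numbered from 1 and separated by a blank line.
    """
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(blocks)
```

Calling `to_srt` once with the source-language text and once with the translated text, using the same timings, would give a matched pair of subtitle files for one recording.
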
The README notes that Windows built-in players ignore external SRT for WAV files and suggests VLC or mpv for playback. A special section explains that Microsoft Teams and Skype block normal loopback for privacy, so Babelive can route their audio through VB-CABLE if it is installed. The README also mentions that the audio layer is being abstracted for a future macOS port.
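
The capture pipeline described above (downmix, 48 kHz to 24 kHz resample, PCM16 packing) can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the app's C# code: the input is assumed to be interleaved float samples in the range -1.0 to 1.0, the 2:1 resample is done by averaging adjacent sample pairs (a crude stand-in for whatever filter the app really uses), and all function names are hypothetical.

```python
import struct


def downmix_to_mono(interleaved, channels=2):
    """Average the channels of each frame into a single mono sample."""
    frame_count = len(interleaved) // channels
    return [
        sum(interleaved[i * channels:(i + 1) * channels]) / channels
        for i in range(frame_count)
    ]


def halve_sample_rate(mono):
    """Go from 48 kHz to 24 kHz by averaging each pair of adjacent samples.

    Averaging acts as a very simple low-pass filter before the 2:1 decimation.
    A trailing unpaired sample is dropped.
    """
    return [(mono[i] + mono[i + 1]) / 2 for i in range(0, len(mono) - 1, 2)]


def to_pcm16(samples):
    """Clamp floats to [-1.0, 1.0] and pack them as little-endian signed 16-bit."""
    ints = [int(round(max(-1.0, min(1.0, s)) * 32767)) for s in samples]
    return struct.pack(f"<{len(ints)}h", *ints)


def prepare_chunk(interleaved, channels=2):
    """Run the full chain: downmix, halve the rate, pack as PCM16 bytes."""
    return to_pcm16(halve_sample_rate(downmix_to_mono(interleaved, channels)))
```

For example, 10 ms of 48 kHz stereo audio is 960 interleaved samples; `prepare_chunk` reduces that to 240 mono samples at 24 kHz, or 480 bytes of PCM16, which is the kind of payload that would then be sent over the WebSocket connection.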

Copy-paste prompts

Prompt 1
Build babelive locally with dotnet 9 and configure my OpenAI key for the realtime translation model
Prompt 2
Walk me through the 48 kHz to 24 kHz PCM16 audio pipeline and the WebSocket frame format
Prompt 3
Help me set up VB-CABLE so Babelive can capture Microsoft Teams audio
Prompt 4
Show me how to add a new target language to the picker UI

Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.