Analysis updated 2026-06-21
Transcribe an interview recording to text on your own computer without uploading to any cloud service.
Generate subtitle files (SRT or VTT) for a video automatically from the spoken audio.
Caption a presentation or meeting recording offline when privacy is important.
Automate transcription of a folder of audio files using the built-in watch folder feature.
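To illustrate the general idea behind a watch folder, here is a minimal polling sketch using only the Python standard library. It is not Buzz's implementation. The names `find_new_audio`, `watch`, `AUDIO_EXTENSIONS`, and the `handle` callback are hypothetical and introduced only for this example, and the extension list is an assumption.

```python
import time
from pathlib import Path

# Assumed set of audio extensions for this sketch; adjust as needed.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}


def find_new_audio(folder, seen):
    """Return audio files in `folder` that are not in `seen`, and record them."""
    new_files = []
    for path in sorted(Path(folder).iterdir()):
        is_audio = path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS
        if is_audio and path not in seen:
            seen.add(path)
            new_files.append(path)
    return new_files


def watch(folder, handle, interval=5.0, max_polls=None):
    """Poll `folder` and call `handle(path)` once for each new audio file.

    `handle` is a hypothetical callback standing in for whatever
    transcription step you want to run. `max_polls=None` polls forever.
    """
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for path in find_new_audio(folder, seen):
            handle(path)
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(interval)
```

Calling `watch("recordings", print, max_polls=1)` would print each audio file currently in a `recordings` folder once and then return. Polling is the simplest approach; a production tool might use file-system notifications instead.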
| | chidiwilliams/buzz | swe-agent/swe-agent | iperov/deepfacelab |
|---|---|---|---|
| Stars | 19,205 | 19,210 | 19,188 |
| Language | Python | Python | Python |
| Setup difficulty | moderate | moderate | hard |
| Complexity | 3/5 | 4/5 | 4/5 |
| Audience | general | researcher | developer |
Figures from each repo's GitHub metadata at analysis time.
GPU acceleration requires CUDA (Nvidia) or a Vulkan-compatible GPU; CPU-only mode works but is significantly slower.
Buzz is a free, offline desktop app that transcribes spoken audio into text and can translate it into other languages, all without sending your audio to any external server. It runs entirely on your own computer, which is useful if privacy matters or you do not have a reliable internet connection. It is powered by OpenAI's Whisper, an open-source speech recognition model. You can use it to transcribe audio and video files, YouTube links, or live audio captured from your microphone in real time. The transcription viewer lets you search the text, control playback speed, and export results as TXT, SRT (subtitle format), or VTT files. It also supports speaker identification, background noise separation for better accuracy, and a watch folder feature that automatically transcribes new files as they appear. GPU acceleration is supported for Nvidia cards (via CUDA), Apple Silicon Macs, and most other GPUs via Vulkan. You would reach for Buzz any time you need to turn audio into text on your own machine: captioning a presentation, transcribing an interview, creating subtitles for a video, or building automation scripts via its command-line interface. It runs on macOS, Windows, and Linux.
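For readers unfamiliar with the SRT and VTT export formats mentioned above, the sketch below shows how timed transcript segments map onto each format. It is not taken from Buzz. The `(start_seconds, end_seconds, text)` segment shape and the function names `format_timestamp`, `to_srt`, and `to_vtt` are assumptions made for illustration.

```python
def format_timestamp(seconds, separator=","):
    """Format seconds as HH:MM:SS<sep>mmm (SRT uses ',', WebVTT uses '.')."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{millis:03d}"


def to_srt(segments):
    """Render (start, end, text) tuples as numbered SRT cues."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text.strip()}")
    return "\n\n".join(blocks) + "\n"


def to_vtt(segments):
    """Render (start, end, text) tuples as a WebVTT file with its header."""
    blocks = []
    for start, end, text in segments:
        timing = (
            f"{format_timestamp(start, '.')} --> {format_timestamp(end, '.')}"
        )
        blocks.append(f"{timing}\n{text.strip()}")
    return "WEBVTT\n\n" + "\n\n".join(blocks) + "\n"
```

The two formats differ mainly in the millisecond separator, the cue numbering that SRT requires, and the `WEBVTT` header line that VTT files must start with.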
A free offline desktop app that converts spoken audio and video into text using OpenAI's Whisper AI model, running entirely on your own computer with no data sent to any external server.
Mainly Python. The stack also includes Whisper and CUDA.
Setup difficulty is rated moderate, with roughly 30 minutes to a first successful run.
The intended audience is mainly general users.
Verify against the repo before relying on details.