Transcribe interviews, podcasts, or meetings into searchable text files on your computer.
Create subtitles for videos by transcribing the audio and exporting the result in SRT or VTT format.
Automatically transcribe new audio files as they arrive in a folder using the watch feature.
Build automation scripts that convert speech to text via the command-line interface.
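The watch-folder idea above can be approximated in a script of your own. The following is a minimal sketch of the general polling technique, not Buzz's actual implementation: the names `find_new_audio` and `watch`, the `handle` callback, and the list of audio suffixes are all hypothetical and chosen for illustration. The transcription step is left to the caller, since the exact Buzz command-line arguments should be checked against the repo.

```python
import time
from pathlib import Path

# Assumed set of audio extensions for this sketch; not taken from Buzz.
AUDIO_SUFFIXES = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}


def find_new_audio(folder, seen):
    """Return audio files in `folder` that are not yet in `seen`, and record them."""
    new_files = []
    for path in sorted(Path(folder).iterdir()):
        is_audio = path.is_file() and path.suffix.lower() in AUDIO_SUFFIXES
        if is_audio and path not in seen:
            seen.add(path)
            new_files.append(path)
    return new_files


def watch(folder, handle, poll_seconds=2.0, max_polls=None):
    """Poll `folder` and call `handle(path)` once for each new audio file.

    `handle` is a caller-supplied function (hypothetical here) that would
    start a transcription, for example by invoking a command-line tool.
    `max_polls=None` means poll until interrupted.
    """
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for path in find_new_audio(folder, seen):
            handle(path)
        polls += 1
        time.sleep(poll_seconds)
```

Calling `watch("incoming", print, max_polls=1)` would print the path of each audio file already in a folder named `incoming` and then return. Polling is used here only because it needs nothing beyond the standard library.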
Downloading the Whisper model and setting up CUDA/GPU support are the main setup hurdles; a CPU fallback is available but slower.
Buzz is a free, offline desktop app that converts spoken audio into text (transcription) and can translate it into other languages, all without sending your audio to any external server. It runs entirely on your own computer, which is useful if privacy matters or you do not have a reliable internet connection. It is powered by OpenAI's Whisper, an open-source speech recognition model. You can use it to transcribe audio and video files, YouTube links, or live audio captured from your microphone in real time. The transcription viewer lets you search the text, control playback speed, and export results as TXT, SRT (subtitle format), or VTT files. It also supports speaker identification, background noise separation for better accuracy, and a watch folder feature that automatically transcribes new files as they appear. GPU acceleration is supported for Nvidia cards (via CUDA), Apple Silicon Macs, and most other GPUs via Vulkan. You would reach for Buzz any time you need to turn audio into text on your own machine: captioning a presentation, transcribing an interview, creating subtitles for a video, or building automation scripts via its command-line interface. It runs on macOS, Windows, and Linux.
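To make the difference between the SRT and VTT export formats concrete, here is a minimal sketch that renders timed segments in both. It illustrates the file formats themselves and is not Buzz's export code: the function names and the `(start_seconds, end_seconds, text)` segment shape are assumptions made for this example.

```python
def format_timestamp(seconds, separator):
    """Format seconds as HH:MM:SS<separator>mmm (SRT uses ',', VTT uses '.')."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{millis:03d}"


def to_srt(segments):
    """Render (start, end, text) segments as SRT: numbered cues, comma before milliseconds."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{format_timestamp(start, ',')} --> {format_timestamp(end, ',')}"
        blocks.append(f"{index}\n{timing}\n{text.strip()}\n")
    return "\n".join(blocks)


def to_vtt(segments):
    """Render (start, end, text) segments as WebVTT: 'WEBVTT' header, dot before milliseconds."""
    cues = []
    for start, end, text in segments:
        timing = f"{format_timestamp(start, '.')} --> {format_timestamp(end, '.')}"
        cues.append(f"{timing}\n{text.strip()}\n")
    return "WEBVTT\n\n" + "\n".join(cues)


if __name__ == "__main__":
    # Hypothetical segments, standing in for a transcription result.
    demo = [(0.0, 2.5, "Hello"), (2.5, 5.0, "World")]
    print(to_srt(demo))
    print(to_vtt(demo))
```

The two formats carry the same cues. SRT numbers each cue and writes milliseconds after a comma, while WebVTT starts with a `WEBVTT` header line and writes milliseconds after a dot.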
Generated 2026-05-18 · Model: sonnet-4-6 · Verify against the repo before relying on details.