freddier/audio-transcript

★ 19 | Python | Audience · general | Complexity · 1/5 | License · MIT | Setup · moderate

Mindmap

mindmap
  root((audio-transcript))
    Input
      Single audio file
      Folder of files
      Auto MP3 conversion
    Transcription
      OpenAI speech to text
      High quality model
      Fast cheap model
    Long Files
      Auto chunking
      5 min default chunks
      Adjustable chunk size
    Options
      Language code hint
      Context keywords
      Save to txt file
    Requirements
      Python 3.10 plus
      OpenAI API key
      ffmpeg installed


Things people build with this

USE CASE 1

Transcribe recorded meetings, interviews, or podcasts into text you can search or edit.

USE CASE 2

Batch-convert a whole folder of voice memos or lecture recordings to text files automatically.

USE CASE 3

Transcribe audio in languages other than English with improved accuracy using a language hint.

USE CASE 4

Keep transcription costs low on large audio libraries by switching to the faster, cheaper model.
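The batch, language-hint, and low-cost use cases above can be sketched in Python. This is only an illustration under stated assumptions, not the tool's actual code: the function names, the extension list, and the two model names are hypothetical choices made for this example, so check the README for what the project really uses. The `language` and `prompt` keys do match parameters of OpenAI's transcription endpoint.

```python
from __future__ import annotations

from pathlib import Path

# Assumption: extensions treated as audio; the real tool may accept a different set.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}

# Assumption: illustrative model names, not confirmed by the project page.
QUALITY_MODEL = "gpt-4o-transcribe"
FAST_MODEL = "gpt-4o-mini-transcribe"


def find_audio_files(target: Path) -> list[Path]:
    """Return the audio files for a single file or a folder (non-recursive)."""
    if target.is_file():
        return [target]
    return sorted(
        p for p in target.iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )


def transcript_path(audio: Path) -> Path:
    """Place the .txt file next to the audio file, sharing its base name."""
    return audio.with_suffix(".txt")


def request_options(
    fast: bool = False,
    language: str | None = None,
    context: str | None = None,
) -> dict[str, str]:
    """Build keyword arguments for a speech-to-text request."""
    options = {"model": FAST_MODEL if fast else QUALITY_MODEL}
    if language:
        options["language"] = language  # ISO 639-1 code such as "es"
    if context:
        options["prompt"] = context  # names, acronyms, topic keywords
    return options
```

With the official `openai` package, options built this way could be passed along as `client.audio.transcriptions.create(file=f, **request_options(fast=True, language="es"))`, once per file returned by `find_audio_files`.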

Tech stack

Python, OpenAI API, ffmpeg

Getting it running

Difficulty · moderate | Time to first run · 30 min

Requires Python 3.10+, an OpenAI API key set as an environment variable, and ffmpeg installed on your machine. README includes setup steps and a troubleshooting section.
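A quick way to confirm the three requirements before a first run is a small preflight check. The sketch below is not part of the project. It assumes the key lives in `OPENAI_API_KEY`, which is the variable the official OpenAI Python SDK reads by default; the README may name a different one. The function name `missing_prerequisites` is hypothetical.

```python
from __future__ import annotations

import os
import shutil
import sys
from collections.abc import Callable, Mapping


def missing_prerequisites(
    env: Mapping[str, str] | None = None,
    version: tuple[int, int] | None = None,
    which: Callable[[str], str | None] = shutil.which,
) -> list[str]:
    """Return a human-readable list of unmet requirements (empty when all are met)."""
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else version

    problems = []
    if version < (3, 10):
        problems.append("Python 3.10 or later is required")
    # Assumption: the key is read from OPENAI_API_KEY (the SDK default name).
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    if which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(f"Missing: {problem}")
```

Run as a script, it prints one line per missing requirement and nothing when the machine is ready. The optional arguments exist only so the check can be exercised without changing the real environment.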

MIT license: free to use, modify, and share for any purpose, including commercial projects, as long as you keep the original license notice.

In plain English

Audio Transcript is a small command-line tool that converts audio files into text using OpenAI's speech-to-text service. You point it at an audio file or a folder full of audio files, and it prints the transcript in your terminal. That is the whole thing.

By default it uses OpenAI's higher-quality transcription model. If you have a lot of audio and want to keep costs and processing time down, a single flag switches it to a faster, cheaper model. You can also pass a language code to improve accuracy when the spoken language is not English, or give the model a short context hint with names, acronyms, or topic keywords that might otherwise get transcribed incorrectly.

Long audio files are handled by splitting them into chunks before uploading. Audio APIs have limits on both file size and how much text they return in one response, so splitting avoids transcripts that cut off partway through. The default chunk length is five minutes, and you can adjust it downward if you still get incomplete results. Files in formats that the API does not accept directly are automatically converted to MP3 using ffmpeg before they are sent.

Transcripts print to the terminal by default. Passing a flag saves a .txt file next to each audio file instead. The tool does not do anything else: no database, no web interface, no account system.

To use it you need Python 3.10 or later, an OpenAI API key, and ffmpeg installed on your machine. The README includes clear setup instructions and a troubleshooting section covering the most common failure cases. The code is released under the MIT license.
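The chunking and MP3 conversion described above can be illustrated with a short sketch. It shows one plausible approach, not the project's confirmed implementation: the function names are hypothetical, and whether the tool invokes ffmpeg with these exact options is an assumption. The ffmpeg options themselves (`-ss`, `-t`, `-vn`, `-codec:a libmp3lame`) are standard.

```python
from __future__ import annotations

from pathlib import Path

DEFAULT_CHUNK_SECONDS = 5 * 60  # the five-minute default described above


def plan_chunks(
    duration: float, chunk_seconds: float = DEFAULT_CHUNK_SECONDS
) -> list[tuple[float, float]]:
    """Return (start, length) pairs in seconds that cover the whole recording."""
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    step = float(chunk_seconds)
    chunks = []
    start = 0.0
    while start < duration:
        # The final chunk is shorter when the duration is not a multiple of step.
        chunks.append((start, min(step, duration - start)))
        start += step
    return chunks


def ffmpeg_chunk_command(
    source: Path, start: float, length: float, output: Path
) -> list[str]:
    """Build an ffmpeg command that cuts one chunk and encodes it as MP3."""
    return [
        "ffmpeg", "-y",           # overwrite the output file if it exists
        "-ss", str(start),        # seek to the chunk start
        "-t", str(length),        # read this many seconds
        "-i", str(source),
        "-vn",                    # drop any video stream
        "-codec:a", "libmp3lame",
        str(output),
    ]
```

Each command could be executed with `subprocess.run(cmd, check=True)` and the resulting MP3 uploaded on its own, with the partial transcripts joined in order. Under this plan a two-hour recording yields 24 chunks at the default size, and lowering `chunk_seconds` produces more, shorter chunks.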

Copy-paste prompts

Prompt 1

I have an audio file called interview.mp3. Using the audio-transcript CLI tool, what exact command do I run to transcribe it and save the result as a .txt file next to the audio?

Prompt 2

My audio recording is in Spanish. How do I use audio-transcript to get a more accurate transcription by specifying the language, and what flag do I pass?

Prompt 3

I have a 2-hour podcast episode I want to transcribe with audio-transcript. How does chunking work, and how do I adjust the chunk size if my transcript keeps cutting off?

Prompt 4

I want to transcribe a whole folder of MP3s cheaply and quickly using audio-transcript. What command switches to the faster, lower-cost model and processes all files in a directory?

Prompt 5

I have WAV files that the OpenAI API doesn't accept directly. Will audio-transcript handle the conversion automatically, and what do I need installed for that to work?


Verify against the repo before relying on details.