Analysis updated 2026-05-18
Run the CLI on a YouTube lecture URL to get a timestamped transcript JSON you can load into a search index or RAG pipeline.
Build a growing dataset of video transcripts by pointing the tool at multiple URLs with --output to append each result to the same file.
Extract metadata like view count, tags, and upload date from a set of competitor videos for SEO research.
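As a rough illustration of the search-index use case above, the following sketch scans transcript records for a keyword and returns the matching moments. The field names `title`, `transcript`, `start`, and `text` are assumptions made for this example, not the tool's confirmed schema, so check them against a real output file before relying on them.

```python
import json


def load_records(path):
    """Load an output file and always return a list of video records."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    # Assumption: the file holds either one record or a list of records.
    return data if isinstance(data, list) else [data]


def find_mentions(records, keyword):
    """Return (title, start, text) for every segment that contains keyword."""
    needle = keyword.lower()
    hits = []
    for record in records:
        # "transcript", "start" and "text" are hypothetical field names.
        for segment in record.get("transcript", []):
            text = segment.get("text", "")
            if needle in text.lower():
                hits.append((record.get("title"), segment.get("start"), text))
    return hits


# Hypothetical sample standing in for the tool's real output.
sample = [
    {
        "title": "Intro to Optimization",
        "transcript": [
            {"start": 0.0, "text": "Welcome to the lecture."},
            {"start": 12.5, "text": "Gradient descent is our first topic."},
        ],
    }
]

for title, start, text in find_mentions(sample, "gradient"):
    print(f"{title} @ {start}s: {text}")
```

A substring scan like this is only a starting point. A real search index or RAG pipeline would typically chunk and embed the segments instead.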
| | kartikeysepta/youtube-transcript-scraper | a-bissell/unleash-lite | abhiinnovates/whatsapp-hr-assistant |
|---|---|---|---|
| Stars | 1 | 1 | 1 |
| Language | Python | Python | Python |
| Setup difficulty | moderate | hard | hard |
| Complexity | 2/5 | 4/5 | 3/5 |
| Audience | data | researcher | developer |
Figures from each repo's GitHub metadata at analysis time.
Requires FFmpeg installed on your system PATH. Gemini mode also needs a Google AI Studio API key added directly to the script.
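A small preflight check can catch both requirements before a long download starts. The sketch below is not part of the repository: the `preflight` helper and the `GEMINI_API_KEY` environment variable name are hypothetical, and reading the key from the environment is an alternative to the script's current approach of storing it inline.

```python
import os
import shutil


def preflight(mode="whisper", env=None, which=shutil.which):
    """Return a list of setup problems; an empty list means ready to run."""
    env = os.environ if env is None else env
    problems = []
    # FFmpeg must be discoverable on PATH for the audio conversion step.
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    # GEMINI_API_KEY is a hypothetical variable name chosen for this sketch.
    if mode == "gemini" and not env.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in preflight(mode="gemini"):
        print("Setup problem:", problem)
```

The `env` and `which` parameters exist only so the check can be exercised without touching the real environment.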
This is a Python command-line tool that takes a YouTube video link and gives you back a structured JSON file containing the video's metadata and a full timestamped transcript. You run a single command with the video URL and it handles the rest: downloading the audio, converting it to MP3, pulling the channel and video details, and transcribing the spoken words with timestamps.

Transcription can happen in two ways. The default method runs a speech-to-text model called Faster Whisper directly on your machine. This works without any external account or API key, and it can use a graphics card to speed up processing if you have one available. The second option sends the audio to Google Gemini, a cloud AI service, which requires a free Google AI Studio API key that you add directly to the script.

The output JSON includes fields like title, channel name, view count, like count, upload date, tags, categories, and the full transcript with time markers so you can find specific moments. When you point the tool at an existing JSON file, it appends the new result rather than overwriting, making it easy to build a growing dataset across multiple videos.

The README lists several practical purposes: building a searchable archive of video transcripts, extracting metadata for SEO or content analysis, creating datasets for AI search systems, and converting long lectures or interviews into text you can work with in other tools.

Setup requires Python, FFmpeg installed on your computer, and cloning the repository. The API key for Gemini mode is currently hardcoded in the script rather than read from an environment variable, which the contributing notes flag as a known improvement to make. The project has no license file yet, so the terms for reuse are unclear until one is added.
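The append-rather-than-overwrite behaviour described above can be sketched as follows. This is an illustration of the general technique, not the repository's actual code: the `append_result` helper is hypothetical, and storing results as a top-level JSON array is an assumption.

```python
import json
import os
import tempfile


def append_result(path, result):
    """Append one result dict to the JSON array at path; return the new count."""
    records = []
    if os.path.exists(path) and os.path.getsize(path) > 0:
        with open(path, encoding="utf-8") as f:
            existing = json.load(f)
        # Assumption: an older file may hold a single object instead of a list.
        records = existing if isinstance(existing, list) else [existing]
    records.append(result)

    # Write to a temporary file first so a crash cannot truncate the dataset.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)
    os.replace(tmp_path, path)
    return len(records)
```

Because a JSON array cannot be extended in place, each call rereads and rewrites the whole file. For very large datasets, a line-delimited format would avoid that cost, at the price of a different file layout.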
A Python CLI that downloads a YouTube video's audio, transcribes it locally with Whisper or via Google Gemini, and outputs a JSON file with full metadata and timestamped transcript.
Mainly Python. The stack also includes yt-dlp and FFmpeg.
No license file has been added yet, so terms for reuse are unclear until one is added.
Setup difficulty is rated moderate, with roughly 30 minutes to a first successful run.
The intended audience is mainly data-focused users.
Verify against the repo before relying on details.