jianchang512/pyvideotrans

Analysis updated 2026-06-24

★ 17,387 | Python | Audience · general | Complexity · 4/5 | Setup · hard

Mindmap

mindmap
  root((pyvideotrans))
    Inputs
      Source video file
      Target language
      ASR engine choice
      TTS engine choice
    Outputs
      Dubbed video
      Translated subtitles
      Cloned voice tracks
    Use Cases
      Translate a tutorial into English
      Dub a YouTube clip with cloned voice
      Batch translate videos on a server
    Tech Stack
      Python
      Faster-Whisper
      Edge-TTS
      CUDA


What do people build with it?

USE CASE 1

Translate a Chinese tutorial video into English with new dubbed audio

USE CASE 2

Generate translated subtitles for an existing video for accessibility

USE CASE 3

Dub a clip using a cloned voice via F5-TTS or GPT-SoVITS

USE CASE 4

Batch translate a folder of videos on a GPU server from the command line
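A batch run like Use Case 4 starts by collecting the files to process. The snippet below is a minimal sketch of that first step only. The function name `find_videos` and the extension list are assumptions for illustration and are not taken from pyVideoTrans; the project's own command-line options should be checked in the repo before wiring this to a real run.

```python
from pathlib import Path
from typing import List

# Assumed set of container formats; adjust to whatever your videos use.
VIDEO_EXTENSIONS = {".mp4", ".mkv", ".mov", ".avi", ".webm"}


def find_videos(folder: str) -> List[Path]:
    """Return every video file under `folder`, recursively, in sorted order.

    Hypothetical helper: it only builds the work list for a batch job and
    does not call pyVideoTrans itself.
    """
    root = Path(folder)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in VIDEO_EXTENSIONS
    )
```

Sorting the result keeps the processing order stable between runs, which makes it easier to resume an interrupted overnight job.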

What is it built with?

Python, Faster-Whisper, Edge-TTS, CUDA

How does it compare?

	jianchang512/pyvideotrans	dortania/opencore-legacy-patcher	allenai/olmocr
Stars	17,387	17,387	17,320
Language	Python	Python	Python
Setup difficulty	hard	moderate	hard
Complexity	4/5	3/5	4/5
Audience	general	general	researcher

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · hard | Time to first run · 1h+

Local models are large and run much faster on a CUDA GPU. Cloud translation needs an API key for DeepSeek, OpenAI, or a similar provider.
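Both requirements can be checked before the first run. The sketch below is a rough preflight under stated assumptions: the function name `preflight` is hypothetical, the environment variable names `DEEPSEEK_API_KEY` and `OPENAI_API_KEY` are common conventions rather than anything pyVideoTrans is confirmed to read, and finding `nvidia-smi` on the PATH only suggests that an NVIDIA driver is installed, not that CUDA works end to end.

```python
import os
import shutil
from typing import Any, Dict, Mapping, Optional, Sequence


def preflight(
    env: Optional[Mapping[str, str]] = None,
    key_names: Sequence[str] = ("DEEPSEEK_API_KEY", "OPENAI_API_KEY"),
) -> Dict[str, Any]:
    """Report whether a GPU tool and cloud API keys appear to be available.

    `key_names` is an assumed list of variable names; replace it with the
    names your chosen translation provider actually uses.
    """
    env = os.environ if env is None else env
    return {
        # A weak hint of an NVIDIA GPU: the driver's CLI tool is on the PATH.
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
        # True only for keys that are set and not blank.
        "api_keys_present": {
            name: bool(env.get(name, "").strip()) for name in key_names
        },
    }


if __name__ == "__main__":
    print(preflight())
```

Passing `env` explicitly makes the check easy to test without touching the real environment.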

In plain English

pyVideoTrans is an open-source tool that translates videos from one language to another in a single workflow: it replaces the original speech with dubbed audio in the new language and generates translated subtitles. The process has four steps. First, it converts the video's speech to text (ASR, or Automatic Speech Recognition). Next, it translates that text into the target language using an AI language model. Then it generates new spoken audio from the translated text (TTS, or Text-to-Speech). Finally, it combines everything back into a finished video. You can pause after any step and correct its output by hand before moving on.

For speech recognition, the tool supports a range of engines, including local offline models (Faster-Whisper) and cloud services. For translation, it connects to AI models such as DeepSeek, ChatGPT, Claude, and Gemini, or to Ollama for fully local, offline translation. For voice generation, it supports Microsoft's Edge-TTS (free) and voice cloning models such as F5-TTS, CosyVoice, and GPT-SoVITS, which can reproduce a specific person's voice style.

Additional features include speaker diarization (identifying who is speaking when), multi-role dubbing (different AI voices for different speakers), vocal separation, and a command-line interface for batch processing on servers. Windows users can download a ready-to-run executable with no setup. Developers on any platform can run it from source using Python. GPU acceleration via CUDA is optional but speeds up local AI models significantly.
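The four-step flow with optional manual review can be sketched as a small pipeline of pluggable stages. This is an illustration of the idea only, not pyVideoTrans's actual code: `Segment`, `run_pipeline`, and the stage signatures are hypothetical names, and the final merge step is represented by returning the collected data rather than by producing a video file.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Segment:
    """One timed line of speech, carried through every stage."""
    start: float          # seconds from the start of the video
    end: float            # seconds from the start of the video
    text: str             # recognized source-language text (ASR output)
    translated: str = ""  # filled in by the translation stage
    audio_path: str = ""  # filled in by the TTS stage


# Hypothetical stage signatures; real engines would be wrapped to fit them.
Recognizer = Callable[[str], List[Segment]]               # video path -> segments
Translator = Callable[[str, str], str]                    # (text, target lang) -> text
Synthesizer = Callable[[str, int], str]                   # (text, index) -> audio path
Reviewer = Callable[[str, List[Segment]], List[Segment]]  # manual correction hook


def run_pipeline(
    video_path: str,
    target_lang: str,
    recognize: Recognizer,
    translate: Translator,
    synthesize: Synthesizer,
    review: Optional[Reviewer] = None,
) -> Dict[str, object]:
    """Run ASR, translation, and TTS in order, pausing for review after each."""

    def checkpoint(stage: str, segments: List[Segment]) -> List[Segment]:
        # Give the caller a chance to correct results before the next stage.
        return review(stage, segments) if review else segments

    # Step 1: speech to text.
    segments = checkpoint("asr", recognize(video_path))

    # Step 2: translate each recognized line.
    for seg in segments:
        seg.translated = translate(seg.text, target_lang)
    segments = checkpoint("translate", segments)

    # Step 3: synthesize one audio clip per translated line.
    for index, seg in enumerate(segments):
        seg.audio_path = synthesize(seg.translated, index)
    segments = checkpoint("tts", segments)

    # Step 4: merging audio and subtitles into the video is left out here;
    # the sketch just returns what a muxing step would need.
    return {"video": video_path, "target_lang": target_lang, "segments": segments}


if __name__ == "__main__":
    # Stub stages stand in for real ASR, translation, and TTS engines.
    result = run_pipeline(
        "input.mp4",
        "en",
        recognize=lambda path: [Segment(0.0, 1.5, "hello")],
        translate=lambda text, lang: f"[{lang}] {text}",
        synthesize=lambda text, index: f"segment_{index}.wav",
    )
    print(result["segments"])
```

Keeping each stage behind a plain callable mirrors the way the tool lets you swap ASR, translation, and TTS engines independently, and the `review` hook corresponds to pausing between steps to fix mistakes.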

Copy-paste prompts

Prompt 1

Walk me through running pyVideoTrans on Windows to dub a Chinese MP4 into English with Edge-TTS

Prompt 2

Show me how to point pyVideoTrans at a local Ollama model for the translation step

Prompt 3

Set up pyVideoTrans with Faster-Whisper on a CUDA GPU and explain how to verify it is using the card

Prompt 4

Use the pyVideoTrans CLI to batch process a folder of videos overnight with speaker diarization on

Prompt 5

Show me how to clone a speaker voice in pyVideoTrans with GPT-SoVITS and reuse it across multiple videos

Frequently asked questions

What is pyvideotrans?

A desktop tool that translates videos end to end by transcribing speech, translating the text with an LLM, dubbing new audio, and merging subtitles.

What language is pyvideotrans written in?

Mainly Python. The stack also includes Faster-Whisper and Edge-TTS.

How hard is pyvideotrans to set up?

Setup difficulty is rated hard, with roughly 1h+ to a first successful run.

Who is pyvideotrans for?

Mainly a general audience.

Verify against the repo before relying on details.