rudrabha/wav2lip

★ 12,984 Python Audience · researcher Complexity · 3/5 License Setup · moderate

Mindmap

mindmap
  root((wav2lip))
    What it does
      Lip sync video
      Cross-language dub
      Animated faces
    Usage Paths
      Local open source
      Commercial API
    Setup
      Model checkpoints
      Python deps
    Audience
      Researchers
      Content creators

Things people build with this

USE CASE 1

Dub a video into a different language by providing new audio while keeping the original speaker's face.

USE CASE 2

Replace or retime speech in a talking-head video so mouth movements match new recorded audio.

USE CASE 3

Run lip-sync inference on custom video and audio pairs using the provided pretrained model weights.

Tech stack

Python · PyTorch

Getting it running

Difficulty · moderate Time to first run · 1h+

Requires downloading pretrained model checkpoint files separately and a Python environment with compatible PyTorch and CUDA versions.
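A quick pre-flight check can save time before the first run. The sketch below is not part of the repository: the helper name `check_environment` and the checkpoint path are illustrative assumptions. It only reports whether PyTorch imports, whether PyTorch can see a CUDA device, and whether a checkpoint file exists at the path you give it.

```python
import os


def check_environment(checkpoint_path):
    """Report whether PyTorch, CUDA, and a checkpoint file are available.

    Hypothetical helper for illustration; it is not provided by Wav2Lip.
    """
    report = {
        "torch_installed": False,
        "torch_version": None,
        "cuda_available": False,
        "checkpoint_found": os.path.isfile(checkpoint_path),
    }
    try:
        import torch  # imported lazily so the check also runs without PyTorch
    except (ImportError, OSError):
        return report
    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    report["cuda_available"] = torch.cuda.is_available()
    return report


# The path below is an assumed location; point it at wherever you saved the file.
for key, value in check_environment("checkpoints/wav2lip_gan.pth").items():
    print(f"{key}: {value}")
```

If `cuda_available` is false on a machine with an NVIDIA GPU, the installed PyTorch build probably does not match the local CUDA driver.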

The open-source code is available for non-commercial research use only; commercial use requires the separate Sync Labs API product.

In plain English

Wav2Lip is a research project that automatically synchronizes the lip movements in a video to match a separate audio track. In plain terms, you give it a video of a person talking and a different audio file, and it produces a new video where the person's mouth moves to match the new audio. This can work across different languages, voices, and identities, including animated or computer-generated faces.

The project came out of a research paper published at ACM Multimedia 2020, and the repository contains the full training and inference code along with pretrained model weights. For someone who wants to try it without writing code, a Google Colab notebook is provided, which lets you run the process in a browser using cloud computing resources without installing anything locally.

The README describes two separate paths for using the technology. The first is the original open-source version, which is free for non-commercial use and requires setting up Python, downloading pretrained models, and running inference scripts locally. The second is a commercial API offered by Sync Labs (sync.so), which the README now promotes prominently as a higher-quality option. The commercial version requires creating an account, getting an API key, and calling the API from Python or TypeScript code. The two paths are independent.

For the open-source path, the setup involves downloading model checkpoints, installing Python dependencies, and running a command-line script that takes a video file and an audio file as inputs. The output is a video file with the lips resynced. The README also covers how to train the model from scratch using your own data, and how to evaluate the quality of results.

The open-source code is available for research use. The commercial Sync Labs product operates under separate terms from the original research code.
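For the open-source path, the command-line step can be sketched from Python as below. This is a minimal illustration under stated assumptions: the script name `inference.py` and the flags `--checkpoint_path`, `--face`, `--audio`, and `--outfile` are taken from memory of the repository's README and should be verified there, while the helper names `build_inference_command` and `run_inference` are hypothetical.

```python
import subprocess
import sys


def build_inference_command(checkpoint, face_video, audio, outfile=None):
    """Build the argument list for the lip-sync script.

    Script name and flags are assumptions; check them against the README.
    """
    cmd = [
        sys.executable, "inference.py",
        "--checkpoint_path", checkpoint,  # pretrained weights downloaded separately
        "--face", face_video,             # video containing the speaker's face
        "--audio", audio,                 # new audio track to sync the lips to
    ]
    if outfile is not None:
        cmd += ["--outfile", outfile]     # where the resynced video is written
    return cmd


def run_inference(repo_dir, checkpoint, face_video, audio, outfile=None):
    """Run the script from inside a local clone of the repository."""
    cmd = build_inference_command(checkpoint, face_video, audio, outfile)
    # check=True raises CalledProcessError if the script exits with an error.
    subprocess.run(cmd, cwd=repo_dir, check=True)
```

A call such as `run_inference("Wav2Lip", "checkpoints/wav2lip_gan.pth", "input.mp4", "dub.wav")` would then launch the script in the cloned directory; every path in that call is a placeholder.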

Copy-paste prompts

Prompt 1

Using the Wav2Lip open-source model, show me the command to sync lips in a video to a new audio file.

Prompt 2

How do I set up the Wav2Lip inference script locally, including where to download the pretrained model checkpoints?

Prompt 3

What Python packages does Wav2Lip require, and how do I install them to avoid dependency conflicts?

Prompt 4

Can I use Wav2Lip on an animated cartoon video? How do I run inference on non-photorealistic faces?

Prompt 5

How do I evaluate the quality of Wav2Lip results? What metrics and evaluation scripts does the repo provide?

Verify against the repo before relying on details.