explaingit

aliaksandrsiarohin/first-order-model

Analysis updated 2026-06-24

15,003 stars | Jupyter Notebook | Audience · researcher | Complexity · 5/5 | Setup · hard

TLDR

Research code for the First Order Motion Model paper that animates a still image to follow the motion of a separate driving video.

Mindmap

mindmap
  root((first-order-model))
    Inputs
      Source image
      Driving video
      Pretrained checkpoint
    Outputs
      Animated MP4
      Predicted keypoints
    Use Cases
      Animate a portrait
      Reenact a face
      Generate fashion clips
    Tech Stack
      Python
      PyTorch
      CUDA
      ffmpeg
      Docker

What do people build with it?

USE CASE 1

Animate a portrait photo to mimic the motion of a talking-head video

USE CASE 2

Reproduce the NeurIPS first order motion paper results on VoxCeleb

USE CASE 3

Generate a fashion clip from a single product photo and a runway video

USE CASE 4

Train a custom motion model on your own dataset such as Taichi or Bair
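Use case 1 corresponds to the repo's command-line demo. The sketch below only assembles that command as an argument list. The flag names (`--config`, `--source_image`, `--driving_video`, `--checkpoint`, `--relative`, `--adapt_scale`) follow the upstream README as best recalled and should be checked against the current `demo.py`. The helper name `build_demo_command` and every path are hypothetical placeholders.

```python
import shlex
import sys


def build_demo_command(config, source_image, driving_video, checkpoint,
                       relative=True, adapt_scale=True):
    """Assemble the argument list for the repo's demo.py (hypothetical helper).

    Flag names are assumptions taken from the upstream README; verify them
    against the checked-out version of demo.py before relying on them.
    """
    cmd = [
        sys.executable, "demo.py",
        "--config", config,
        "--source_image", source_image,
        "--driving_video", driving_video,
        "--checkpoint", checkpoint,
    ]
    if relative:
        cmd.append("--relative")      # relative keypoint mode
    if adapt_scale:
        cmd.append("--adapt_scale")   # rescale motion to the source subject
    return cmd


# Placeholder paths for illustration only.
cmd = build_demo_command(
    config="config/dataset_name.yaml",
    source_image="path/to/source.png",
    driving_video="path/to/driving.mp4",
    checkpoint="path/to/checkpoint",
)
print(shlex.join(cmd))
```

To actually run it, pass `cmd` to `subprocess.run(cmd, check=True)` from the repository root once the dependencies and checkpoint are in place.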

What is it built with?

Python, PyTorch, CUDA, ffmpeg, Docker, Jupyter

How does it compare?

| | aliaksandrsiarohin/first-order-model | google-deepmind/deepmind-research | graykode/nlp-tutorial |
|---|---|---|---|
| Stars | 15,003 | 14,923 | 14,897 |
| Language | Jupyter Notebook | Jupyter Notebook | Jupyter Notebook |
| Setup difficulty | hard | hard | moderate |
| Complexity | 5/5 | 5/5 | 3/5 |
| Audience | researcher | researcher | researcher |

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · hard | Time to first run · 1 day+

Needs a CUDA GPU, pre-trained weights downloaded from Google Drive or Yandex Disk, and matching PyTorch and CUDA versions. The Docker image helps when local libraries conflict.
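Before downloading weights, it can save time to confirm the prerequisites listed above. The following is a minimal sketch, not part of the repo: the function name `preflight_report` and the dictionary keys are hypothetical. It reports whether ffmpeg is on the PATH, whether PyTorch imports, which CUDA version the installed PyTorch build targets, and whether a CUDA GPU is visible.

```python
import shutil


def preflight_report():
    """Collect basic facts about the local environment (hypothetical helper)."""
    report = {
        "ffmpeg_on_path": shutil.which("ffmpeg") is not None,
        "torch_installed": False,
        "torch_version": None,
        "torch_cuda_build": None,
        "cuda_gpu_available": False,
    }
    try:
        import torch
    except (ImportError, OSError):
        # PyTorch is missing or its native libraries failed to load.
        return report

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # CUDA version this PyTorch build was compiled against; None on CPU-only builds.
    report["torch_cuda_build"] = torch.version.cuda
    report["cuda_gpu_available"] = torch.cuda.is_available()
    return report


for key, value in preflight_report().items():
    print(f"{key}: {value}")
```

If `torch_cuda_build` is set but `cuda_gpu_available` is false, a driver or CUDA version mismatch is a likely cause, which is the situation the Docker image is meant to sidestep.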

In plain English

This repository contains the research code for the paper First Order Motion Model for Image Animation, published at NeurIPS by Aliaksandr Siarohin and four co-authors. The project takes one still image of a subject and a separate video of something or someone moving, then produces a new video in which the subject from the still image moves the way the subject in the driving video does. The README shows example results for three datasets: VoxCeleb (talking faces), Fashion (people modeling clothes), and MGIF.

The technique works without knowing in advance what the subject is, so you do not have to label parts of a face or a body. The model learns its own set of keypoints during training and then transfers the motion of those keypoints from the driving video onto the source image. The README describes two ways to run the animation step: an absolute mode that copies the driving keypoints directly, and a relative mode that uses only the change in keypoint positions and applies that change to the source. The relative mode usually looks better but needs the first frame of the driving video to have a similar pose to the source image.

The project is written in Python 3 with PyTorch, with one YAML configuration file per dataset. The README explains how to install dependencies, how to download pre-trained model files from Google Drive or Yandex Disk, and how to run a single command-line demo that produces a result.mp4 file. A helper script suggests crop boxes for a YouTube video using ffmpeg, and there is also a Docker image with nvidia-docker support if local library versions cause trouble. There are notebooks for running the demo on Google Colab and Kaggle, including a graphical interface contributed by a community user.

The README also covers training your own model on one of the supported datasets (Bair, MGIF, Fashion, and Taichi) and evaluating reconstruction quality. It also points to a follow-up project that adds support for face-swap and articulated objects.
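The difference between the absolute and relative modes described above can be sketched with plain NumPy arrays. This is a simplified illustration, not the repo's implementation: it treats keypoints as `(K, 2)` coordinate arrays and ignores anything else the real model attaches to them. The function names and sample values are hypothetical.

```python
import numpy as np


def animate_absolute(kp_source, kp_driving, kp_driving_initial):
    """Absolute mode (sketch): use the driving keypoints as they are."""
    return np.array(kp_driving, dtype=float)


def animate_relative(kp_source, kp_driving, kp_driving_initial):
    """Relative mode (sketch): move the source keypoints by the displacement
    of the driving keypoints since the first driving frame."""
    displacement = (np.asarray(kp_driving, dtype=float)
                    - np.asarray(kp_driving_initial, dtype=float))
    return np.asarray(kp_source, dtype=float) + displacement


# Two keypoints each; values are made up for illustration.
kp_source = np.array([[0.0, 0.0], [0.5, 0.5]])
kp_driving_initial = np.array([[0.25, 0.25], [0.75, 0.75]])
kp_driving = np.array([[0.5, 0.25], [0.75, 1.0]])

print("absolute:", animate_absolute(kp_source, kp_driving, kp_driving_initial))
print("relative:", animate_relative(kp_source, kp_driving, kp_driving_initial))
```

In this simplified view, the relative mode measures motion from the first driving frame and adds it to the source pose. That suggests why the two starting poses should be similar: if they differ, the same displacement no longer describes a matching movement for the source subject.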

Copy-paste prompts

Prompt 1
Run first-order-model on Colab with my own photo and a 10 second driving video
Prompt 2
Explain the difference between absolute and relative animation modes in this repo
Prompt 3
Train first-order-model on a custom dataset of pet faces using one GPU
Prompt 4
Write a wrapper script that takes a YouTube URL, crops the face, and animates a source photo
Prompt 5
Why are my generated frames blurry in first-order-model and which config keys should I tune

Frequently asked questions

What is first-order-model?

Research code for the First Order Motion Model paper that animates a still image to follow the motion of a separate driving video.

What language is first-order-model written in?

GitHub classifies the repository as mainly Jupyter Notebook. The code itself is Python, and the stack also includes PyTorch and CUDA.

How hard is first-order-model to set up?

Setup difficulty is rated hard, with roughly a day or more to a first successful run.

Who is first-order-model for?

Mainly researchers.

Verify against the repo before relying on details.