aliaksandrsiarohin/first-order-model

Analysis updated 2026-06-24

★ 15,003 | Jupyter Notebook | Audience · researcher | Complexity · 5/5 | Setup · hard

Mindmap

mindmap
  root((first-order-model))
    Inputs
      Source image
      Driving video
      Pretrained checkpoint
    Outputs
      Animated MP4
      Predicted keypoints
    Use Cases
      Animate a portrait
      Reenact a face
      Generate fashion clips
    Tech Stack
      Python
      PyTorch
      CUDA
      ffmpeg
      Docker


What do people build with it?

USE CASE 1

Animate a portrait photo to mimic the motion of a talking-head video

USE CASE 2

Reproduce the results of the NeurIPS First Order Motion Model paper on VoxCeleb

USE CASE 3

Generate a fashion clip from a single product photo and a runway video

USE CASE 4

Train a motion model on your own dataset or on a supported one such as Taichi or Bair

What is it built with?

Python, PyTorch, CUDA, ffmpeg, Docker, Jupyter

How does it compare?

	aliaksandrsiarohin/first-order-model	google-deepmind/deepmind-research	graykode/nlp-tutorial
Stars	15,003	14,923	14,897
Language	Jupyter Notebook	Jupyter Notebook	Jupyter Notebook
Setup difficulty	hard	hard	moderate
Complexity	5/5	5/5	3/5
Audience	researcher	researcher	researcher

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · hard | Time to first run · 1 day+

Needs a CUDA GPU, pre-trained weights downloaded from Google Drive or Yandex Disk, and matching PyTorch and CUDA versions. The Docker image helps when local libraries conflict.
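Once the weights are in place, the README's single command-line demo is the quickest check that the setup works. The sketch below only assembles that command as an argument list. The flag names (`--config`, `--driving_video`, `--source_image`, `--checkpoint`, `--relative`, `--adapt_scale`) and the file names are recalled from the repository and should be treated as assumptions to verify against the current README. `build_demo_command` is a hypothetical helper, not part of the repo.

```python
import shlex
from typing import List


def build_demo_command(
    config: str,
    source_image: str,
    driving_video: str,
    checkpoint: str,
    relative: bool = True,
) -> List[str]:
    """Assemble an argument list for the repo's demo script.

    Flag names are assumptions taken from memory of the README;
    check them before running.
    """
    cmd = [
        "python", "demo.py",
        "--config", config,
        "--driving_video", driving_video,
        "--source_image", source_image,
        "--checkpoint", checkpoint,
    ]
    if relative:
        # Relative mode, with scale adaptation between source and driving.
        cmd += ["--relative", "--adapt_scale"]
    return cmd


# Hypothetical paths; substitute your own files.
cmd = build_demo_command(
    config="config/vox-256.yaml",
    source_image="source.png",
    driving_video="driving.mp4",
    checkpoint="vox-cpk.pth.tar",
)
print(shlex.join(cmd))
```

Pass the list to `subprocess.run(cmd, check=True)` from the repository root to launch the demo. Keeping the command as a list avoids shell-quoting problems with paths that contain spaces.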

In plain English

This repository contains the research code for an academic paper called First Order Motion Model for Image Animation, published at NeurIPS by Aliaksandr Siarohin and four co-authors. The project takes one still image of a subject and a separate video of something or someone moving, then produces a new video in which the subject from the still image moves the way the subject in the driving video does. The README shows example results for three datasets: VoxCeleb (talking faces), Fashion (people modeling clothes), and MGIF.

The technique works without knowing in advance what the subject is. You do not have to label parts of a face or a body. The model learns its own set of keypoints during training and then transfers the motion of those keypoints from the driving video onto the source image. The README describes two ways to run the animation step: an absolute mode that copies the driving keypoints directly, and a relative mode that takes only the change in keypoint positions and applies that change to the source. The relative mode usually looks better but needs the first frame of the driving video to have a similar pose to the source image.

The project is written in Python 3 with PyTorch, with one YAML configuration file per dataset. The README explains how to install dependencies, how to download pre-trained model files from Google Drive or Yandex Disk, and how to run a single command-line demo that produces a result.mp4 file. A helper script suggests crop boxes for a YouTube video using ffmpeg, and there is also a Docker image with nvidia-docker support if local library versions cause trouble. There are notebooks for running the demo on Google Colab and Kaggle, including a graphical interface contributed by a community user.

The README also covers training your own model on one of the supported datasets (Bair, MGIF, Fashion, and Taichi) and evaluating reconstruction quality. It also points to a follow-up project that adds support for face-swap and articulated objects.
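The difference between the absolute and relative animation modes can be illustrated with a few 2D points. This is a minimal sketch under simplifying assumptions, not the repository's implementation: keypoints are plain `(x, y)` tuples, the function names `transfer_absolute` and `transfer_relative` are hypothetical, and any scale adaptation or other per-keypoint information the real model may use is left out.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def transfer_absolute(driving: List[Point]) -> List[Point]:
    """Absolute mode: use the current driving keypoints as they are."""
    return list(driving)


def transfer_relative(
    source: List[Point],
    driving: List[Point],
    driving_initial: List[Point],
) -> List[Point]:
    """Relative mode: move each source keypoint by the displacement of the
    matching driving keypoint since the first driving frame."""
    return [
        (sx + (dx - ix), sy + (dy - iy))
        for (sx, sy), (dx, dy), (ix, iy) in zip(source, driving, driving_initial)
    ]


# Toy keypoints (illustrative values only).
source = [(0.0, 0.0), (1.0, 1.0)]
driving_initial = [(0.5, 0.5), (1.5, 1.5)]
driving_now = [(0.75, 0.5), (1.5, 2.0)]

print(transfer_absolute(driving_now))
print(transfer_relative(source, driving_now, driving_initial))
```

In relative mode the source keypoints stay where they are until the driving keypoints move away from their first-frame positions, which is why the first driving frame should show a pose similar to the source image.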

Copy-paste prompts

Prompt 1

Run first-order-model on Colab with my own photo and a 10 second driving video

Prompt 2

Explain the difference between absolute and relative animation modes in this repo

Prompt 3

Train first-order-model on a custom dataset of pet faces using one GPU

Prompt 4

Write a wrapper script that takes a YouTube URL, crops the face, and animates a source photo

Prompt 5

Why are my generated frames blurry in first-order-model and which config keys should I tune

Frequently asked questions

What is first-order-model?

Research code for the First Order Motion Model paper that animates a still image to follow the motion of a separate driving video.

What language is first-order-model written in?

GitHub reports Jupyter Notebook as the main language, but the project code is written in Python 3 with PyTorch and runs on CUDA.

How hard is first-order-model to set up?

Setup difficulty is rated hard; expect a day or more to reach a first successful run.

Who is first-order-model for?

Mainly researchers.


Verify against the repo before relying on details.