magic-research/magic-animate

★ 10,903 | Python | Audience · researcher | Complexity · 4/5 | Setup · hard

Mindmap

mindmap
  root((MagicAnimate))
    What it does
      Human animation
      Photo to video
      Motion transfer
    Inputs
      Reference photo
      DensePose sequence
    Tech Stack
      Python PyTorch
      Stable Diffusion
    Setup
      Hugging Face models
      GPU required
      ffmpeg needed


Things people build with this

USE CASE 1

Animate a portrait photo to follow a dance or body movement sequence using DensePose pose data

USE CASE 2

Generate consistent person animations for creative video projects without motion capture hardware

USE CASE 3

Experiment with human video generation research using pretrained appearance encoder models

USE CASE 4

Try human animation in the browser via the Hugging Face Spaces demo without any local setup

Tech stack

Python, PyTorch

Getting it running

Difficulty · hard | Time to first run · 1h+

Requires a compatible GPU, ffmpeg, and downloading several large pretrained model files from Hugging Face before any inference can run.
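The prerequisites above can be checked before attempting inference. The following is a minimal sketch, not part of the repository: the `preflight` helper is hypothetical, and the folder names in `EXPECTED_MODEL_DIRS` are assumptions about the layout that should be replaced with whatever the README actually specifies. It checks the Python version, looks for ffmpeg on the PATH, and confirms that each expected model folder exists. It does not check the GPU, since that would require importing PyTorch.

```python
import shutil
import sys
from pathlib import Path

# Assumption: these folder names are illustrative; take the real ones from the README.
EXPECTED_MODEL_DIRS = ("MagicAnimate", "stable-diffusion-v1-5", "sd-vae-ft-mse")


def preflight(model_root, expected_dirs=EXPECTED_MODEL_DIRS, min_python=(3, 8)):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []

    # The Python version must be at least min_python.
    if sys.version_info < min_python:
        found = ".".join(str(part) for part in sys.version_info[:3])
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted}+ required, found {found}")

    # ffmpeg must be reachable on the PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")

    # Every expected model folder must exist under model_root.
    root = Path(model_root)
    for name in expected_dirs:
        if not (root / name).is_dir():
            problems.append(f"missing model directory: {root / name}")

    return problems


if __name__ == "__main__":
    # "pretrained_models" is an assumed location for the downloaded weights.
    for problem in preflight("pretrained_models"):
        print("PROBLEM:", problem)
```

Returning a list of problems rather than raising on the first failure lets you see everything that is missing in a single run, which helps when several large downloads are still pending.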

In plain English

MagicAnimate is a research project from the National University of Singapore and ByteDance that turns a still photo of a person into a short video by making the person follow a sequence of body movements. You provide two inputs: a reference image of the person you want to animate, and a motion sequence (a series of pose frames representing how a body should move). The system then generates a video in which the person in your photo performs those movements while keeping their appearance consistent across frames.

The technique is built on top of Stable Diffusion, a widely used image generation system. Stable Diffusion learns to produce images by progressively removing noise from a random starting point, and researchers have found it can be extended to produce video frames as well. MagicAnimate adds components for keeping the person's appearance consistent over time and for reading body pose information in a format called DensePose, which maps body positions onto a surface representation of a human figure.

To use the code, you need a machine with a compatible GPU, Python 3.8 or higher, and the video-processing tool ffmpeg. Setup involves downloading several pretrained model files from Hugging Face (a platform for hosting AI models) and placing them in a specific directory structure before running the included scripts. The README provides step-by-step folder layout instructions and commands for running on one or multiple GPUs. There is also an online demo hosted on Hugging Face Spaces where you can try it without installing anything.

This work was presented at CVPR 2024, one of the main academic conferences for computer vision research. The code was released in late 2023. The repository does not state a license in the README beyond linking the paper and models.

Copy-paste prompts

Prompt 1

Help me set up the MagicAnimate repository, download the required pretrained models from Hugging Face, and arrange the directory structure before running inference

Prompt 2

Using MagicAnimate, write a script that takes a reference image and a DensePose motion sequence and outputs an animated video with consistent appearance across all frames

Prompt 3

Help me run MagicAnimate on two GPUs using the multi-GPU inference script and explain what the appearance encoder and temporal attention modules are doing

Prompt 4

Set up ffmpeg and the MagicAnimate environment on Ubuntu with a single A100 GPU to animate a custom portrait image with a custom motion sequence


Verify against the repo before relying on details.