Generate a talking-head video from a single photo and a speech audio file for a digital avatar or presentation.
Create animated spokespersons for video content without filming real people, using just a photo and script audio.
Test EchoMimicV2 through the Gradio web interface without writing any Python code.
Integrate EchoMimicV2 into a ComfyUI visual workflow for automated talking-head video production.
Requires a high-end GPU such as an A100, plus a download of multi-gigabyte model weights from Hugging Face before first use.
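Before starting the download, a quick check of the machine can save time. The sketch below is not part of EchoMimicV2. It uses only the Python standard library to report free disk space and whether `nvidia-smi` is on the PATH. The function name `preflight` and the 20 GB default threshold are placeholder assumptions, not figures from the project, so check the repo for the actual weight sizes.

```python
import shutil
from pathlib import Path


def preflight(weights_dir: str = ".", min_free_gb: float = 20.0) -> dict:
    """Report whether this machine looks ready for a large weight download.

    min_free_gb is a placeholder assumption, not a figure from the project.
    """
    target = Path(weights_dir).resolve()
    # Free space on the filesystem that holds the target directory, in GiB.
    free_gb = shutil.disk_usage(target).free / 1024**3
    return {
        "free_gb": round(free_gb, 1),
        "enough_disk": free_gb >= min_free_gb,
        # nvidia-smi on PATH suggests an NVIDIA driver is installed;
        # it does not prove the GPU has enough memory for inference.
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for key, value in preflight().items():
        print(f"{key}: {value}")
```

A passing report only means the basics are in place. Whether a given GPU is fast enough, or has enough memory, still has to be confirmed against the repo's own requirements.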
EchoMimicV2 is an AI system developed by researchers at Ant Group (the company behind Alipay) that generates a video of a person talking from a still photo and an audio clip. You provide a reference image and a speech recording, and the system produces a video in which the person in the image appears to speak, with lips, head, and upper body moving in sync with the audio. It covers more than the face: it also animates the upper body, including shoulder and hand movement, which the researchers call semi-body animation. The work was accepted at CVPR 2025, one of the top computer vision research conferences.

The system supports both English and Chinese audio input. Standard inference takes roughly 7 minutes to produce 120 frames of video. An accelerated version released in January 2025 cuts that to about 50 seconds on a high-end A100 GPU, which the project reports as a 9x improvement.

A Gradio web interface lets users test it without writing Python code, and a ComfyUI integration is available for those who prefer that visual workflow tool. Internally, the process aligns the reference image with pose information extracted from a driving video, then generates the final animated output.

The repository provides inference scripts, a Jupyter notebook demo, and the training dataset list along with processing scripts. The model weights are hosted on Hugging Face and ModelScope.

This is a research release aimed at people working in AI video generation, digital avatar creation, or related areas. It is not a simple consumer application: setup requires installing several Python dependencies and downloading multi-gigabyte model weights. The README links to installation tutorials and a community discussion thread covering common setup problems.
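To make the timing figures concrete, the short calculation below turns them into per-frame cost and a speedup ratio. It uses only the rounded numbers quoted above (120 frames, roughly 7 minutes, about 50 seconds). The 24 fps playback rate is an assumption for illustration, not a value taken from the project.

```python
FRAMES = 120
STANDARD_SECONDS = 7 * 60    # "roughly 7 minutes"
ACCELERATED_SECONDS = 50     # "about 50 seconds"
ASSUMED_FPS = 24             # assumption: playback rate is not stated above


def seconds_per_frame(total_seconds: float, frames: int) -> float:
    """Average wall-clock time spent generating one frame."""
    return total_seconds / frames


def speedup(baseline_seconds: float, accelerated_seconds: float) -> float:
    """How many times faster the accelerated run is than the baseline."""
    return baseline_seconds / accelerated_seconds


def clip_length_seconds(frames: int, fps: float) -> float:
    """Duration of the generated clip at a given playback rate."""
    return frames / fps


if __name__ == "__main__":
    print(f"standard:    {seconds_per_frame(STANDARD_SECONDS, FRAMES):.2f} s/frame")     # 3.50
    print(f"accelerated: {seconds_per_frame(ACCELERATED_SECONDS, FRAMES):.2f} s/frame")  # 0.42
    print(f"speedup:     {speedup(STANDARD_SECONDS, ACCELERATED_SECONDS):.1f}x")         # 8.4x
    print(f"clip length: {clip_length_seconds(FRAMES, ASSUMED_FPS):.1f} s")              # 5.0
```

With these rounded inputs the ratio comes out near 8.4x, which is in line with the reported 9x given that both timings are approximate. At the assumed 24 fps, 120 frames is only about 5 seconds of video, so even the accelerated path takes several times longer to run than the clip it produces.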
Verify against the repo before relying on details.