HY-Motion 1.0 is an AI model from Tencent's Hunyuan research team that generates 3D human motion animations from text descriptions. You describe an action in plain English, for example "a person performs a squat, then pushes a barbell overhead", and the model produces a sequence of skeleton poses representing that movement. The output can be imported directly into 3D animation workflows.

The model combines a Diffusion Transformer (DiT) architecture with Flow Matching, a generative method that produces high-quality outputs by gradually refining a noisy signal into a clean one. It is trained in three stages: first on over 3,000 hours of diverse motion capture data, then fine-tuned on 400 hours of curated high-quality recordings, and finally refined using reinforcement learning from human feedback. Two model sizes are available: a standard 1-billion-parameter version and a lighter 0.46-billion-parameter version. Both require at least 24 GB of GPU video memory (VRAM) to run locally.

The generated animations work with humanoid characters only; the model does not support animals, multi-person interactions, or environment-dependent movements. Prompts work best in English and should focus on physical actions and limb movements rather than emotions, clothing, or camera angles. You can run the model from the command line to batch-process many prompts, or through an interactive browser-based interface built with Gradio. Model weights are downloaded separately from Hugging Face. The project is written in Python.
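To make the "gradually refining a noisy signal" idea more concrete, here is a minimal toy sketch of flow-matching-style sampling in plain Python. It is not HY-Motion's code or API: `toy_velocity`, `sample`, and the three-number `target_pose` are hypothetical names and values invented for illustration. In a real system a trained network (here, a DiT conditioned on the text prompt) would predict the velocity; this sketch assumes a single known target and uses the exact straight-line velocity toward it instead.

```python
import random


def toy_velocity(x, t, target):
    """Hypothetical stand-in for a trained network.

    Returns the exact velocity of the straight-line path from the
    current point x at time t to a single known target.
    """
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]


def sample(target, steps=10, seed=0):
    """Integrate dx/dt = v(x, t) from t = 0 (noise) to t = 1 with Euler steps."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # start from pure Gaussian noise
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt  # t stays strictly below 1, so the division above is safe
        v = toy_velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]  # one Euler refinement step
    return x


# Made-up "pose" values, purely for illustration.
target_pose = [0.0, 0.9, -0.3]
result = sample(target_pose)
```

Each loop iteration moves the noisy sample a little further along the path toward the target, which is the sense in which the signal is refined step by step. With a learned velocity field the endpoint is a new sample from the data distribution rather than a fixed target.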
Verify against the repo before relying on details.