Reproduce the SUGAR paper results on the six demo tasks: CarryBox, KickBox, PushBox, SitChair, StandBottle, and PickBottle.
Run inference.sh with a task name and the released tracker and generator checkpoints to watch a policy run in IsaacSim.
Train a new policy from scratch with train.sh, given a task name and an optional experiment name.
Reuse the sugar_rl or sugar_il packages to build on the unitree_rl_lab and DexGraspVLA stacks.
Setup needs IsaacSim 5.1.0, IsaacLab 2.3.0 with a flatdict pin, a recent NVIDIA GPU with CUDA, and three large Google Drive downloads.
SUGAR is a research code release from a team at Peking University and Beihang University that trains humanoid robots to perform whole-body manipulation tasks by learning from third-person videos of humans interacting with objects. The acronym, expanded in the README, stands for a Scalable Human-Video-Driven Generalizable Humanoid Loco-Manipulation Learning Framework, and the project has an accompanying arXiv paper and a demo website. The framework is built on top of IsaacLab, a manager-based simulation framework from NVIDIA that runs inside IsaacSim. Given the human-interaction videos as input, the pipeline learns autonomous control policies for a humanoid robot that the authors describe as deployable to the real world. The current code release covers six example tasks: CarryBox, KickBox, PushBox, SitChair, StandBottle, and PickBottle.
Installation is involved. A user is asked to create a conda environment with Python 3.11, install IsaacSim 5.1.0 from the NVIDIA package index, clone and check out IsaacLab at version 2.3.0 with a specific flatdict pin and then run an isaaclab.sh script to install rsl_rl, and finally install the project's own two Python packages, sugar_rl and sugar_il, in editable mode. RTX 5090 users get a separate torch 2.8.0 install line targeting the CUDA 12.8 wheels. Three data archives are downloaded from Google Drive using gdown: a 400 MB main data zip, a 50 MB descriptions zip, and a 250 MB demo checkpoints zip.
After setup, the README provides two shell scripts. inference.sh takes a task name, with optional tracker and generator checkpoint paths, and runs the demo policy. train.sh takes a task name and an optional experiment name to train from scratch.
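The two entry points described above can be wrapped in a small helper that checks the task name before anything is launched. This is only a sketch under stated assumptions: the summary does not say how inference.sh and train.sh receive their arguments, so the positional order used here (task, then tracker checkpoint, then generator checkpoint; task, then experiment name) is a guess, and the function names inference_command and train_command are hypothetical. Check the scripts in the repository before relying on it.

```python
from typing import List, Optional

# The six example tasks named in the release.
TASKS = ("CarryBox", "KickBox", "PushBox", "SitChair", "StandBottle", "PickBottle")


def _check_task(task: str) -> None:
    """Reject task names that are not among the six released tasks."""
    if task not in TASKS:
        raise ValueError(f"unknown task {task!r}; expected one of {', '.join(TASKS)}")


def inference_command(task: str,
                      tracker_ckpt: Optional[str] = None,
                      generator_ckpt: Optional[str] = None) -> List[str]:
    """Build an argv list for inference.sh.

    ASSUMPTION: arguments are positional, in the order
    task, tracker checkpoint, generator checkpoint.
    """
    _check_task(task)
    cmd = ["bash", "inference.sh", task]
    if tracker_ckpt is None and generator_ckpt is not None:
        # With positional arguments, a generator path cannot be given alone.
        raise ValueError("a generator checkpoint requires a tracker checkpoint before it")
    if tracker_ckpt is not None:
        cmd.append(tracker_ckpt)
    if generator_ckpt is not None:
        cmd.append(generator_ckpt)
    return cmd


def train_command(task: str, experiment: Optional[str] = None) -> List[str]:
    """Build an argv list for train.sh.

    ASSUMPTION: arguments are positional, task first, experiment name second.
    """
    _check_task(task)
    cmd = ["bash", "train.sh", task]
    if experiment is not None:
        cmd.append(experiment)
    return cmd
```

Passing the returned list to subprocess.run(cmd, check=True) from the repository root would launch the script. Building the argument list separately keeps the task-name check testable without IsaacSim installed.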
According to the TODO list, three items are already released: inference checkpoints; the full training pipeline, including refiner, tracker, and generator; and processed data for all six tasks. Two items are still to come: a data processing pipeline that converts RGB-D human videos into training data, and a sim-to-sim transfer pipeline. The code reuses two upstream codebases acknowledged in the README: unitree_rl_lab together with beyondmimic for the sugar_rl reinforcement learning component, and DexGraspVLA for the sugar_il imitation learning component. The project is released under the MIT license.
Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.