explaingit

ammarkov/sam3dbody-cpp

Analysis updated 2026-05-18

563 stars · C · Audience · researcher · Complexity · 5/5 · Setup · hard

TLDR

A C++ real-time engine that reconstructs 3D full-body poses and hand poses from a single camera, exporting per-person BVH motion-capture files ready to use in Blender or other 3D software.

Mindmap

mindmap
  root((SAM3DBody-cpp))
    What it does
      3D body pose from one camera
      70-joint skeleton with hands
      Multi-person tracking
      BVH motion-capture export
    Pipeline stages
      YOLO person detection
      DINOv2 feature extraction
      Transformer decoder
      Linear blend skinning
    Outputs
      BVH files per person
      3D mesh vertices
      CSV joint coordinates
    Requirements
      CUDA GPU recommended
      ONNX Runtime
      5 GB model download

What do people build with it?

USE CASE 1

Record a video of people moving and generate BVH motion-capture files to drive character animations in Blender without special motion-capture hardware.
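
To make the BVH output concrete, here is a minimal sketch of what writing such a file involves. The three-joint skeleton, the joint names, the offsets, and the `bvh_text` helper are all illustrative assumptions; the files the repo exports describe a 70-joint rig and may use a different channel order.

```python
# Minimal BVH writer sketch. The skeleton below (Hips -> Spine -> Head) is a
# made-up example, not the rig exported by sam3dbody-cpp.

HIERARCHY_LINES = [
    "HIERARCHY",
    "ROOT Hips",
    "{",
    "  OFFSET 0.0 0.0 0.0",
    "  CHANNELS 6 Xposition Yposition Zposition Zrotation Xrotation Yrotation",
    "  JOINT Spine",
    "  {",
    "    OFFSET 0.0 20.0 0.0",
    "    CHANNELS 3 Zrotation Xrotation Yrotation",
    "    JOINT Head",
    "    {",
    "      OFFSET 0.0 30.0 0.0",
    "      CHANNELS 3 Zrotation Xrotation Yrotation",
    "      End Site",
    "      {",
    "        OFFSET 0.0 10.0 0.0",
    "      }",
    "    }",
    "  }",
    "}",
]

CHANNELS_PER_FRAME = 6 + 3 + 3  # root channels plus two child joints


def bvh_text(frames, frame_time=1.0 / 30.0):
    """Return the text of a BVH file for the example skeleton.

    Each frame is a sequence of 12 numbers, ordered as the CHANNELS
    lines above declare them.
    """
    lines = list(HIERARCHY_LINES)
    lines.append("MOTION")
    lines.append("Frames: %d" % len(frames))
    lines.append("Frame Time: %.6f" % frame_time)
    for frame in frames:
        if len(frame) != CHANNELS_PER_FRAME:
            raise ValueError("expected %d channel values" % CHANNELS_PER_FRAME)
        lines.append(" ".join("%.4f" % value for value in frame))
    return "\n".join(lines) + "\n"
```

Writing the returned string to a `.bvh` file gives something Blender's BVH importer can read. A real exporter would repeat the `JOINT` blocks for every joint in the rig and emit one motion line per video frame.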

USE CASE 2

Extract 70-joint 3D body and hand positions from a video clip and export them as CSV data for analysis in a research project.
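
As a rough illustration of the CSV route, the sketch below flattens per-frame joint positions into rows. The column layout (`frame, joint, x, y, z`), the joint names, and the `joints_to_csv` helper are assumptions made for this example; the repo's own exporter may arrange its columns differently.

```python
import csv
import io


def joints_to_csv(frames, joint_names):
    """Serialize per-frame 3D joints as CSV text, one row per joint per frame.

    frames: list of frames, each a list of (x, y, z) tuples.
    joint_names: one label per joint, in the same order as each frame.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["frame", "joint", "x", "y", "z"])
    for frame_index, joints in enumerate(frames):
        if len(joints) != len(joint_names):
            raise ValueError("frame %d has the wrong joint count" % frame_index)
        for name, (x, y, z) in zip(joint_names, joints):
            writer.writerow([frame_index, name, x, y, z])
    return buffer.getvalue()
```

A long format like this, with one row per joint, loads directly into a spreadsheet or a dataframe and makes it easy to filter by joint name.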

USE CASE 3

Integrate the compiled shared library into a Python script using ctypes to get real-time 3D body pose data from a camera feed.

USE CASE 4

Use the multi-person tracking to capture synchronized BVH files for multiple performers in the same scene from a single ordinary camera.
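
The summary does not say how the repo keeps identities stable across frames. One common approach, shown here purely as an assumption, is to match each new detection to the previous frame's bounding boxes by overlap (IoU) and give unmatched detections a fresh ID. The function names and the 0.3 threshold are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign_ids(tracks, detections, next_id, min_iou=0.3):
    """Greedily carry person IDs from the previous frame to this one.

    tracks: dict mapping person ID -> box from the previous frame.
    detections: list of boxes found in the current frame.
    Returns (new_tracks, next_id).
    """
    # Score every track/detection pair, best overlap first.
    pairs = sorted(
        ((iou(box, det), track_id, det_index)
         for track_id, box in tracks.items()
         for det_index, det in enumerate(detections)),
        reverse=True,
    )
    new_tracks = {}
    used_tracks, used_dets = set(), set()
    for score, track_id, det_index in pairs:
        if score < min_iou:
            break  # remaining pairs overlap too little to be the same person
        if track_id in used_tracks or det_index in used_dets:
            continue
        new_tracks[track_id] = detections[det_index]
        used_tracks.add(track_id)
        used_dets.add(det_index)
    # Anything left unmatched is treated as a newly appeared person.
    for det_index, det in enumerate(detections):
        if det_index not in used_dets:
            new_tracks[next_id] = det
            next_id += 1
    return new_tracks, next_id
```

With stable IDs like these, each person's poses can be appended to their own BVH file frame after frame.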

What is it built with?

C · C++ · ONNX Runtime · ggml · CUDA · CMake · Python

How does it compare?

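| | ammarkov/sam3dbody-cpp | fractalfir/crustc | facex-engine/facex |
| --- | --- | --- | --- |
| Stars | 563 | 331 | 189 |
| Language | C | C | C |
| Setup difficulty | hard | hard | moderate |
| Complexity | 5/5 | 5/5 | 4/5 |
| Audience | researcher | developer | developer |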

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · hard · Time to first run · 1 day+

Real-time use requires a CUDA-capable GPU (a CPU-only version exists but takes 5 to 15 seconds per frame). You also need to download approximately 5 GB of ONNX model files from HuggingFace and build with CMake.

The README does not state a license directly; check the repository for a license file before use.

In plain English

SAM3DBody-cpp is a C++ program that takes video from a single ordinary camera and produces a three-dimensional model of every person's body and hands visible in each frame, in real time. It does not require depth cameras, multiple camera setups, special sensors, or any Python installation to run. The entire inference pipeline runs through compiled C++ code.

For each person detected in the video, it produces a file in BVH, a standard motion-capture format. These files record how every joint in the body, including the hands, moves from frame to frame. You can open them directly in animation software like Blender, and a bundled Blender plugin is included to drive a character rig from the results. Each person in a multi-person scene gets their own BVH file, and the software keeps track of which person is which across frames so the identities do not swap.

The system works by running a sequence of neural network models. A detection model finds people in the image. A large visual understanding model then analyzes each person's crop and produces a compact description of their pose. A final set of lightweight models decodes that description into the 519 specific numbers that describe a full-body skeleton with hand poses and even basic facial expressions. A linear blend skinning step then converts those numbers into 18,439 surface points forming the body mesh, plus 70 labeled joint positions.

Running the GPU-accelerated version requires a CUDA-capable graphics card and about 5 gigabytes of model files downloaded from HuggingFace. A CPU-only version exists but processes a single frame in 5 to 15 seconds depending on the computer, which is not practical for video. Building from source requires CMake and a C++ compiler; ONNX Runtime and ggml handle the model execution. Python frontends are included for users who want to call the compiled library from Python scripts without rewriting any C++ code.

A CSV exporter is also available for the 70 joint positions if you need the data in spreadsheet form. The full README is longer than what was shown.
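
The linear blend skinning step mentioned above can be sketched in a few lines. This is the textbook formulation, in which each vertex is a weighted average of its position under every joint's transform. It is not taken from the repo's source: the 3x4 matrix layout, the function names, and the tiny two-vertex mesh in the test are assumptions for illustration.

```python
def transform(matrix, point):
    """Apply a 3x4 affine matrix (three rows of [r0, r1, r2, t]) to a 3D point."""
    return tuple(
        row[0] * point[0] + row[1] * point[1] + row[2] * point[2] + row[3]
        for row in matrix
    )


def skin(vertices, weights, joint_matrices):
    """Linear blend skinning: v' = sum over joints j of w[j] * (M[j] applied to v).

    vertices: rest-pose (x, y, z) points.
    weights: one row per vertex, one weight per joint, each row summing to 1.
    joint_matrices: one 3x4 matrix per joint, mapping rest pose to posed space.
    """
    skinned = []
    for vertex, weight_row in zip(vertices, weights):
        blended = [0.0, 0.0, 0.0]
        for weight, matrix in zip(weight_row, joint_matrices):
            if weight == 0.0:
                continue  # this joint does not influence the vertex
            moved = transform(matrix, vertex)
            for axis in range(3):
                blended[axis] += weight * moved[axis]
        skinned.append(tuple(blended))
    return skinned
```

In the real pipeline the same blend would run over all 18,439 mesh vertices, with the joint matrices derived from the decoded pose parameters.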

Copy-paste prompts

Prompt 1
I have a CUDA GPU and want to run SAM3DBody-cpp on a video file to get BVH motion-capture output for each person. Walk me through downloading the model files, building with CMake, and running the first test.
Prompt 2
I want to use the Python frontend of SAM3DBody-cpp to get 70 joint positions per frame from a video. Show me how to load the shared library with ctypes and call the inference function.
Prompt 3
How do I import the BVH files output by SAM3DBody-cpp into Blender and drive a MakeHuman character rig using the bundled plugin?
Prompt 4
I only have a CPU machine and want to run SAM3DBody-cpp on individual images rather than video. Which model files do I download, and what command-line flags do I use to disable CUDA?

Frequently asked questions

What is sam3dbody-cpp?

A C++ real-time engine that reconstructs 3D full-body poses and hand poses from a single camera, exporting per-person BVH motion-capture files ready to use in Blender or other 3D software.

What language is sam3dbody-cpp written in?

Mainly C. The stack also includes C++, ONNX Runtime, ggml, CUDA, CMake, and Python.

What license does sam3dbody-cpp use?

The README does not state a license directly; check the repository for a license file before use.

How hard is sam3dbody-cpp to set up?

Setup difficulty is rated hard, with roughly a day or more to a first successful run.

Who is sam3dbody-cpp for?

Mainly researchers.

Verify against the repo before relying on details.