apple/ml-velox

Analysis updated 2026-06-24

★ 13 | Audience · researcher | Complexity · 4/5 | Setup · hard

Mindmap

mindmap
  root((ml-velox))
    Inputs
      Spacetime color point cloud
    Outputs
      Dynamic tokens
      4D surface
      3D Gaussians
    Use Cases
      Video to 4D generation
      3D tracking over time
      Image to 4D cloth simulation
    Tech Stack
      Python
      PyTorch
      3D Gaussians

What do people build with it?

USE CASE 1

Read the paper and cite Velox for 4D representation learning work

USE CASE 2

Watch the video results on the project site to evaluate the method

USE CASE 3

Reference the encoder plus dual-decoder design for your own 4D token research

What is it built with?

Python · PyTorch · 3D Gaussians

How does it compare?

	apple/ml-velox	09catho/axon	0x1-1/revival
Stars	13	13	13
Language	—	JavaScript	C++
Setup difficulty	hard	moderate	hard
Complexity	4/5	4/5	5/5
Audience	researcher	researcher	developer

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · hard | Time to first run · 1 day+

The README is a paper landing page with no install steps or code overview, so running the method means either waiting for a code release or assembling the codebase yourself.

Released under a custom Apple license file in the repo, with a separate license for the sample data. Check LICENSE before use.

In plain English

This repository is Apple's official companion page for the research paper Velox: Learning Representations of 4D Geometry and Appearance, which is being presented at CVPR 2026. CVPR is a major annual conference for computer vision research. The README is short and acts mostly as a pointer to the paper on arXiv and to the project website hosted on Apple's GitHub Pages.

The research deals with 4D objects. Here, 4D means a 3D object plus time: a moving, deforming shape rather than a still statue. The authors describe a method that takes a messy moving point cloud, that is, a swarm of colored 3D points captured over time, and learns to compress it into a small set of tokens that still carry the shape and the look of the object. The team frames three goals for these tokens: they should be descriptive enough to recreate geometry and color, compact enough to be efficient for later use, and easy to build from sparse input.

The README's abstract explains the training setup at a high level. A single encoder turns the spacetime point cloud into the tokens. Two decoders then read those tokens during training. One reconstructs the time-varying surface of the object, which teaches the tokens to capture shape. The other produces 3D Gaussians, a popular way to render scenes today, which teaches the tokens to capture appearance and color.

To show the tokens are useful in practice, the paper applies them to three downstream tasks: turning a video into a 4D model, tracking a 3D scene over time, and an image-to-4D pipeline used for cloth simulation. The README says the authors see strong results on all three and points readers to the project website for video examples.

The rest of the README is housekeeping. It lists the repository license and a separate license for the sample data, credits other open source projects used in the codebase, and provides the BibTeX citation. There are no install instructions, no code overview, and no usage examples in the README itself.
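
Since the README ships no usage examples, the data flow described above (a variable-size spacetime color point cloud goes in, a small fixed set of tokens comes out, and two decoders read the same tokens) can be illustrated with a toy, standard-library-only sketch. Everything here is an assumption made for illustration: the `SpacetimePoint` layout, the 7-value token, and the `toy_*` functions are hypothetical names, and the averaging "encoder" is a stand-in for a learned network, not the Velox method.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SpacetimePoint:
    """One sample of a spacetime color point cloud (assumed layout)."""
    x: float
    y: float
    z: float
    t: float                          # time stamp of the sample
    rgb: Tuple[float, float, float]   # color, each channel in [0, 1]


Token = List[float]  # hypothetical: one token is a short feature vector


def toy_encode(points: List[SpacetimePoint], num_tokens: int) -> List[Token]:
    """Toy stand-in for the encoder; NOT the Velox architecture.

    Sorts points by time, splits them into `num_tokens` contiguous chunks,
    and averages each chunk's (x, y, z, t, r, g, b) into one 7-value token.
    """
    if num_tokens <= 0 or len(points) < num_tokens:
        raise ValueError("need at least one point per token")
    ordered = sorted(points, key=lambda p: p.t)
    tokens: List[Token] = []
    for i in range(num_tokens):
        # Integer chunk boundaries; each chunk is non-empty because
        # len(ordered) >= num_tokens.
        start = i * len(ordered) // num_tokens
        end = (i + 1) * len(ordered) // num_tokens
        rows = [(p.x, p.y, p.z, p.t, *p.rgb) for p in ordered[start:end]]
        tokens.append([sum(col) / len(rows) for col in zip(*rows)])
    return tokens


def toy_decode_geometry(tokens: List[Token]) -> List[List[float]]:
    """Toy 'shape' head: reads (x, y, z, t) back out of each token."""
    return [tok[:4] for tok in tokens]


def toy_decode_appearance(tokens: List[Token]) -> List[List[float]]:
    """Toy 'appearance' head: reads (r, g, b) back out of each token."""
    return [tok[4:] for tok in tokens]


# 100 red points moving along the x axis over time.
cloud = [
    SpacetimePoint(float(i), 0.0, 0.0, t=i / 10, rgb=(1.0, 0.0, 0.0))
    for i in range(100)
]
tokens = toy_encode(cloud, num_tokens=4)
print(len(tokens), len(tokens[0]))  # prints: 4 7
```

The point of the sketch is only the shape of the pipeline: one encoder call produces the tokens, and both decoder heads consume that same token list. In the paper's setting the encoder and decoders are trained networks, and the appearance decoder emits 3D Gaussians rather than raw colors.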

Copy-paste prompts

Prompt 1

Summarize the Velox CVPR 2026 paper architecture in terms of encoder, 4D surface decoder, and Gaussian decoder

Prompt 2

Compare Velox tokens to 4D Gaussian Splatting representations for video-to-4D generation

Prompt 3

Outline how to reimplement the Velox encoder for spacetime color point clouds in PyTorch

Prompt 4

Explain how the cloth simulation downstream task uses image-to-4D from Velox tokens

Frequently asked questions

What is ml-velox?

Apple research project page for Velox, a CVPR 2026 paper on learning compact tokens for 4D objects from spacetime point clouds. README links to arXiv and the project site.

What license does ml-velox use?

Released under a custom Apple license file in the repo, with a separate license for the sample data. Check LICENSE before use.

How hard is ml-velox to set up?

Setup difficulty is rated hard, with roughly 1 day+ to a first successful run.

Who is ml-velox for?

Mainly researchers.

Verify against the repo before relying on details.