Neuralangelo is a research project from NVIDIA that reconstructs detailed 3D surface models from ordinary video footage. Given a short video of an object or scene, the system estimates the 3D shape of what was filmed and produces a mesh file that can be used in 3D software, games, or visual effects. The name is a nod to Michelangelo, reflecting the goal of high-fidelity surface detail. It was presented at the Computer Vision and Pattern Recognition conference (CVPR) in 2023.

The process has two main stages. First, the video frames are processed to estimate the position and orientation of the camera for every frame. This step uses a separate tool called COLMAP, which matches visual features across frames to infer where the camera was. Neuralangelo then takes those estimated camera poses together with the video frames and trains a neural representation of the scene's geometry. Once training finishes, a separate script extracts the surface as a mesh file.

Running it requires a powerful NVIDIA GPU. The default configuration needs 24 GB of GPU memory. The README includes a table showing which settings to reduce on a smaller GPU, at the cost of lower reconstruction detail. For custom video, good results depend on clean footage: minimal motion blur and a consistent focus range help COLMAP recover accurate camera poses, which directly affects the quality of the final surface.

Setup is done either through Docker containers (two separate images, one for the data preprocessing step and one for the main training) or through a Conda environment file included in the repository. A Google Colab notebook is also available for trying the system without a local GPU. The code is built on NVIDIA's Imaginaire library. For commercial or research licensing, the README points to NVIDIA's research inquiry form rather than offering an open commercial license.
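The point about motion blur can be turned into a quick pre-check on extracted frames before they are handed to COLMAP. The sketch below is not part of Neuralangelo or COLMAP; it is a common sharpness heuristic (variance of the Laplacian) written with NumPy only. The function names, the `box_blur` demo helper, and the threshold are all hypothetical, and a usable threshold would have to be tuned per camera and resolution.

```python
import numpy as np


def laplacian_variance(gray: np.ndarray) -> float:
    """Sharpness score: variance of a 4-neighbour Laplacian over interior pixels.

    Blurry frames have weaker edges, so their score is lower.
    """
    g = gray.astype(np.float64)
    lap = (
        g[:-2, 1:-1] + g[2:, 1:-1]      # pixels above and below
        + g[1:-1, :-2] + g[1:-1, 2:]    # pixels left and right
        - 4.0 * g[1:-1, 1:-1]           # centre pixel
    )
    return float(lap.var())


def select_sharp_frames(frames, threshold):
    """Return indices of frames whose sharpness score is at least `threshold`.

    `threshold` is an assumption of this sketch, not a value from the project.
    """
    return [i for i, frame in enumerate(frames)
            if laplacian_variance(frame) >= threshold]


def box_blur(gray: np.ndarray, k: int = 5) -> np.ndarray:
    """Demo-only helper: simulate horizontal motion blur with a box filter."""
    kernel = np.ones(k) / k
    return np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="same"),
        1,
        gray.astype(np.float64),
    )


# Synthetic demo: a one-pixel checkerboard and a blurred copy of it.
sharp = np.indices((64, 64)).sum(axis=0) % 2 * 255.0
blurred = box_blur(sharp)
scores = [laplacian_variance(sharp), laplacian_variance(blurred)]
cutoff = sum(scores) / 2.0  # arbitrary midpoint, for illustration only
kept = select_sharp_frames([sharp, blurred], cutoff)
```

On real footage the frames would be grayscale arrays decoded from the video, and frames that score below the chosen cutoff would be dropped or re-shot rather than passed on to pose estimation.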
Verify against the repo before relying on details.