explaingit

microsoft/trellis.2

6,802 | Python | Audience · designer | Complexity · 5/5 | License | Setup · hard

TLDR

A Microsoft research tool that converts a single photograph into a detailed 3D model with textures you can use in game engines or 3D software, powered by a 4-billion-parameter AI model.

Mindmap

mindmap
  root((TRELLIS.2))
    What it does
      Image to 3D model
      Texture generation
      GLB file export
    Tech Stack
      Python
      PyTorch
      CUDA
      Hugging Face
    Use Cases
      Game asset creation
      Product visualization
      3D texture generation
    Audience
      Game developers
      3D artists
      Researchers

Things people build with this

USE CASE 1

Convert a product photo into a 3D model for use in a game engine or e-commerce 3D viewer

USE CASE 2

Generate textures for an existing 3D shape by providing a reference image

USE CASE 3

Create 3D props for game development from simple reference photographs

USE CASE 4

Try image-to-3D generation without a GPU using the live Hugging Face Spaces demo

Tech stack

Python · PyTorch · CUDA

Getting it running

Difficulty · hard Time to first run · 1h+

Requires an NVIDIA GPU with at least 24GB of VRAM and a custom package install script that takes significant time to complete.
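Before starting the long install, it can help to confirm the machine meets the GPU memory requirement. The following is a minimal sketch, not part of the TRELLIS.2 repo: it shells out to `nvidia-smi` (using its standard `--query-gpu` and `--format` options) and compares the reported memory against a threshold. The helper names and the 24,000 MiB cutoff are assumptions made for illustration; cards sold as 24 GB may report slightly less than 24 × 1024 MiB, so the cutoff is deliberately a little loose.

```python
import shutil
import subprocess

# Assumed cutoff: the requirement is "at least 24GB"; this sketch accepts
# anything reporting 24,000 MiB or more. Adjust to taste.
MIN_VRAM_MIB = 24_000


def parse_gpu_memory(csv_text):
    """Parse 'name, memory.total' CSV rows (no header, no units) into (name, MiB) pairs."""
    gpus = []
    for line in csv_text.strip().splitlines():
        try:
            name, mem = line.rsplit(",", 1)
            gpus.append((name.strip(), int(mem.strip())))
        except ValueError:
            # Skip rows that are malformed or report a non-numeric value.
            continue
    return gpus


def query_gpus():
    """Ask nvidia-smi for each GPU's name and total memory; return [] if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_memory(result.stdout)


def has_enough_vram(gpus, min_mib=MIN_VRAM_MIB):
    """True if at least one GPU meets the (assumed) memory threshold."""
    return any(mem >= min_mib for _, mem in gpus)


if __name__ == "__main__":
    found = query_gpus()
    for name, mem in found:
        print(f"{name}: {mem} MiB")
    if has_enough_vram(found):
        print("At least one GPU meets the assumed VRAM threshold.")
    else:
        print("No GPU meets the assumed VRAM threshold.")
```

Parsing is kept separate from the subprocess call so the check can be exercised on a machine without a GPU.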

Use freely for any purpose, including commercial use, as long as you include the original MIT copyright notice.

In plain English

TRELLIS.2 is a research project from Microsoft that turns a single photograph into a detailed three-dimensional model. You give it an image, and it outputs a 3D asset complete with textures, color, roughness, and transparency information that you can use in a game engine, 3D editor, or rendering software.

The model is large, at 4 billion parameters. It uses a new internal representation called O-Voxel, a way of storing 3D shapes that can handle tricky geometry such as thin surfaces, holes, and objects with enclosed interior structure. Standard approaches in this area often struggle with those cases. Generation takes roughly 3 to 60 seconds depending on the resolution you want, running on high-end NVIDIA graphics hardware.

Besides creating 3D shapes from images, the system can generate textures for an existing 3D shape you provide. The output can be exported as a GLB file, a standard format for 3D objects that many applications can open.

The project is research code intended for experimentation. It runs on Linux and requires an NVIDIA GPU with at least 24GB of memory. Setup involves installing a collection of specialized packages through a provided script, which the readme notes can take a while. A pretrained 4-billion-parameter model is available on Hugging Face, and a live demo on Hugging Face Spaces lets you try it without any local setup. The code and model weights are released under the MIT license.
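Since the exported asset is a GLB file, a quick sanity check on any output is to read the 12-byte header defined by the glTF 2.0 binary format: a magic number spelling "glTF", a version, and the total file length, all little-endian unsigned 32-bit integers. The sketch below uses only the Python standard library; the function name is hypothetical and it does not touch any TRELLIS.2 code.

```python
import struct

GLB_MAGIC = 0x46546C67  # the ASCII bytes "glTF" read as a little-endian uint32


def read_glb_header(data):
    """Validate the 12-byte GLB header and return its fields as a dict."""
    if len(data) < 12:
        raise ValueError("too short to be a GLB file")
    magic, version, length = struct.unpack_from("<III", data, 0)
    if magic != GLB_MAGIC:
        raise ValueError("missing glTF magic bytes")
    return {
        "version": version,
        "declared_length": length,
        # A mismatch here usually means a truncated or padded file.
        "length_matches": length == len(data),
    }
```

To check a file on disk, pass its bytes, for example `read_glb_header(open("model.glb", "rb").read())`, where `model.glb` stands in for whatever path your export used.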

Copy-paste prompts

Prompt 1
I have a photo of a shoe and want to create a 3D model using TRELLIS.2. Walk me through the setup on Linux with an NVIDIA GPU and generate the GLB file.
Prompt 2
Use TRELLIS.2 to generate a textured 3D model from a product photo and export it as GLB for use in a Three.js web viewer.
Prompt 3
I want to test TRELLIS.2 without a GPU. Show me how to use the Hugging Face Spaces demo and what image types give the best 3D reconstruction.
Prompt 4
Help me batch-process product images through TRELLIS.2 to generate 3D models and automate the export to GLB format.
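
For the batch-processing prompt, the surrounding loop is generic and can be sketched without knowing the TRELLIS.2 API. In the sketch below, `convert` is a hypothetical callable that you supply; it is assumed to take an input image path and an output path and write a GLB file there. Nothing here calls TRELLIS.2 itself, and the list of image suffixes is also an assumption.

```python
from pathlib import Path

# Assumed set of input formats; adjust to whatever your pipeline accepts.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def batch_convert(input_dir, output_dir, convert):
    """Run convert(image_path, glb_path) for every image in input_dir.

    Existing outputs are skipped so an interrupted run can be resumed,
    and failures are collected instead of stopping the whole batch.
    """
    in_dir, out_dir = Path(input_dir), Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    done, skipped, failed = [], [], []
    for image in sorted(in_dir.iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        target = out_dir / (image.stem + ".glb")
        if target.exists():
            skipped.append(target)
            continue
        try:
            convert(image, target)
            done.append(target)
        except Exception as exc:  # keep going; report the failure at the end
            failed.append((image, str(exc)))
    return {"done": done, "skipped": skipped, "failed": failed}
```

Passing the conversion step in as a function keeps the loop testable with a stub and leaves the actual model call to whatever interface the repo exposes.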

Verify against the repo before relying on details.