Generate thousands of labeled synthetic outdoor scenes with depth maps and segmentation masks to train a computer vision model without hiring photographers.
Create realistic indoor room datasets with furniture for training robot navigation or object detection models.
Produce articulated 3D objects like doors and drawers for use in physics simulators for robotics research.
Build large-scale annotated training datasets at near-zero marginal cost using procedural generation instead of manual annotation.
Requires a specific Blender version and a set of Python dependencies. A GPU is recommended for rendering at useful scale.
Infinigen is a research project from Princeton University that generates photorealistic 3D worlds, rooms, and objects entirely through code, without using hand-crafted assets or real-world footage. The system builds everything from scratch using procedural generation: it follows mathematical rules to create trees, terrain, furniture, and interiors that look realistic but are entirely synthetic.

The project covers three major areas. Infinigen-Nature creates outdoor environments like forests, mountains, and waterways. Infinigen-Indoors builds detailed room scenes with furniture and realistic lighting. Infinigen-Articulated generates objects with moving parts, such as doors or drawers, that can be dropped into physics simulators for robotics research.

The main audience is researchers in computer vision and robotics who need large amounts of realistic training data without hiring photographers or 3D artists. The system can produce thousands of unique scenes, each with built-in labels like depth maps, surface normals, and object segmentation masks, which are the annotations that machine learning models are typically trained on. Getting that data from real photos would be expensive, whereas Infinigen produces it at nearly zero marginal cost per scene.

The codebase is Python-based and depends on Blender, a free 3D tool, to handle the actual rendering. Setup involves installing Python dependencies and then running command-line scripts that output rendered images along with their annotations. Three quickstart guides cover the three main areas, and a set of documentation pages explains cameras, materials, fluid simulations, and export to formats like OBJ or OpenUSD.

The project is backed by academic papers published at major computer vision conferences in 2023 and 2024, with a third paper on articulated assets published in 2025. It maintains a public roadmap and accepts outside contributions.
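
To make the per-scene labels described above more concrete, here is a minimal sketch of how a depth map and an object segmentation mask might be combined once they are loaded as arrays. It assumes both annotations are 2D NumPy arrays of the same shape, with the mask holding one integer ID per pixel. The function name `per_object_depth_stats` and the tiny in-memory arrays are hypothetical stand-ins, since this summary does not describe Infinigen's actual output file layout or format.

```python
import numpy as np


def per_object_depth_stats(depth, seg):
    """Return {object_id: (pixel_count, mean_depth)} for each ID in the mask.

    Hypothetical helper: `depth` is a float array of per-pixel distances and
    `seg` is an integer array of per-pixel object IDs with the same shape.
    """
    if depth.shape != seg.shape:
        raise ValueError("depth and segmentation must have the same shape")

    stats = {}
    for obj_id in np.unique(seg):
        mask = seg == obj_id
        # Ignore non-finite depth values (e.g. sky pixels stored as inf).
        valid = mask & np.isfinite(depth)
        count = int(mask.sum())
        mean = float(depth[valid].mean()) if valid.any() else float("nan")
        stats[int(obj_id)] = (count, mean)
    return stats


# Stand-in data: a 2x3 "image" containing two objects (IDs 0 and 1).
depth = np.array([[1.0, 1.0, 4.0],
                  [1.0, 4.0, 4.0]], dtype=np.float32)
seg = np.array([[0, 0, 1],
                [0, 1, 1]], dtype=np.int32)

stats = per_object_depth_stats(depth, seg)
for obj_id, (count, mean) in stats.items():
    print(f"object {obj_id}: {count} pixels, mean depth {mean:.2f}")
```

In a real pipeline the two arrays would be read from the files that the rendering scripts write, and the same per-object grouping could be used to filter out objects that are too small or too far away to be useful training targets.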
Verify against the repo before relying on details.