Analysis updated 2026-05-18
Reconstruct a 3D mesh of an indoor room from a set of photos taken from multiple angles for use in a 3D visualization or virtual walkthrough.
Use smartphone video of an interior space, compute camera positions with COLMAP, and run GenRecon to produce an editable 3D mesh with surface materials.
Evaluate GenRecon on ScanNet++ benchmark scenes to compare its reconstruction quality against other methods for a research paper.
Fine-tune the pretrained GenRecon models on a custom indoor dataset to improve reconstruction quality for a specific type of space.
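For the smartphone-video workflow above, the camera-pose step could be scripted roughly as follows. This is a minimal sketch that only assembles the standard COLMAP command lines (`feature_extractor`, `sequential_matcher`, `mapper`). The directory layout and the `colmap_commands` and `run_colmap` helpers are assumptions made for illustration, and the GenRecon invocation itself is left out because it is not described here.

```python
from pathlib import Path
import subprocess


def colmap_commands(workdir, image_dir="images"):
    """Build COLMAP sparse-reconstruction commands for a folder of video frames.

    Hypothetical layout: <workdir>/images holds the extracted frames;
    <workdir>/database.db and <workdir>/sparse are written by COLMAP.
    """
    work = Path(workdir)
    db = str(work / "database.db")
    images = str(work / image_dir)
    sparse = str(work / "sparse")
    return [
        ["colmap", "feature_extractor", "--database_path", db, "--image_path", images],
        # Sequential matching suits frames taken from a continuous video.
        ["colmap", "sequential_matcher", "--database_path", db],
        ["colmap", "mapper", "--database_path", db, "--image_path", images,
         "--output_path", sparse],
    ]


def run_colmap(workdir, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    cmds = colmap_commands(workdir)
    if not dry_run:
        # The mapper expects its output directory to exist already.
        (Path(workdir) / "sparse").mkdir(parents=True, exist_ok=True)
    for cmd in cmds:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
    return cmds
```

Calling `run_colmap("my_room")` prints the three commands without running them; passing `dry_run=False` executes them and requires COLMAP on the `PATH`.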
| | kasothaphie/genrecon | pluviobyte/video-production-skills | tianhangzhuzth/fundamental-ava |
|---|---|---|---|
| Stars | 478 | 503 | 521 |
| Language | Python | Python | Python |
| Setup difficulty | hard | easy | moderate |
| Complexity | 5/5 | 2/5 | 4/5 |
| Audience | researcher | developer | researcher |
Figures from each repo's GitHub metadata at analysis time.
Requires a CUDA-capable GPU and the CUDA 12 toolkit. You must also run the multi-step setup.sh and download the pretrained checkpoint files before inference is possible.
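Before starting the setup, it can help to confirm the basic prerequisites are present. The following is a minimal sketch using only the Python standard library; the `check_prerequisites` helper is hypothetical, it assumes `setup.sh` sits at the top of the cloned repository, and it does not verify the CUDA version or the checkpoint files.

```python
import shutil
from pathlib import Path


def check_prerequisites(repo_dir, which=shutil.which):
    """Report whether the NVIDIA driver tool, the CUDA compiler, and setup.sh are found.

    `which` is injectable so the check can be exercised without a GPU machine.
    """
    return {
        # nvidia-smi ships with the NVIDIA driver; its presence suggests a usable GPU.
        "nvidia-smi": which("nvidia-smi") is not None,
        # nvcc is the compiler shipped with the CUDA toolkit.
        "nvcc": which("nvcc") is not None,
        # Assumed location of the setup script: the repository root.
        "setup.sh": (Path(repo_dir) / "setup.sh").is_file(),
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites(".").items():
        print(f"{name}: {'found' if ok else 'missing'}")
```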
GenRecon is a research system from the Technical University of Munich that reconstructs detailed three-dimensional models of indoor rooms from a set of ordinary photographs. Given multiple photos taken from different angles around a room, the system produces a complete 3D mesh with surface materials, not just a point cloud or rough shape.

The technical approach is unusual in that it uses a generative model, a type of AI that has learned what indoor spaces generally look like from large datasets, as a guide during the reconstruction process. Most reconstruction methods work purely from the input photos. GenRecon also conditions on those photos but additionally draws on the generative model's knowledge to fill in areas that are unclear or partially obscured in the photographs. The system divides a large scene into overlapping sections, reconstructs each one, and assembles them into a coherent whole.

The outputs are mesh files with physically based rendering materials, meaning the geometry and surface appearance are represented in a format that game engines and professional 3D software can use directly. The paper accompanying this code reports that the system outperforms other reconstruction methods on standard benchmarks by about 16 percent.

Setup requires a CUDA-capable GPU, the CUDA toolkit, and running a setup script that installs Python dependencies including PyTorch and several compiled extensions. Pretrained weights for the three neural network components involved in the pipeline are available for download. Training the models from scratch requires preparing large indoor scene datasets and running three separate training stages.

The codebase is research code released alongside an academic paper published in May 2026. It is designed for researchers and engineers working on 3D reconstruction, computer vision, or applications that need realistic 3D scans of interior spaces from smartphone video or structured photo captures.
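The idea of dividing a large scene into overlapping sections can be illustrated with a small tiling routine. This is a generic sketch, not GenRecon's actual chunking code: the function names, the 2D floor-plan simplification, and the default chunk size and overlap are all assumptions chosen for illustration.

```python
from itertools import product


def tile_starts(extent, chunk, overlap):
    """Start offsets of chunks of length `chunk` that cover [0, extent],
    with neighbouring chunks sharing at least `overlap`."""
    if chunk <= overlap:
        raise ValueError("chunk must be larger than overlap")
    if extent <= chunk:
        return [0.0]  # a single chunk already covers the whole extent
    stride = chunk - overlap
    starts = []
    pos = 0.0
    while pos + chunk < extent:
        starts.append(pos)
        pos += stride
    starts.append(extent - chunk)  # last chunk sits flush with the far edge
    return starts


def split_scene(min_xy, max_xy, chunk=4.0, overlap=1.0):
    """Split a floor-plan bounding box into overlapping square chunks.

    Returns (x_min, y_min, x_max, y_max) boxes in scene coordinates.
    """
    (x0, y0), (x1, y1) = min_xy, max_xy
    xs = tile_starts(x1 - x0, chunk, overlap)
    ys = tile_starts(y1 - y0, chunk, overlap)
    return [(x0 + sx, y0 + sy, x0 + sx + chunk, y0 + sy + chunk)
            for sx, sy in product(xs, ys)]
```

For a 10 m by 6 m room with the defaults, `split_scene((0, 0), (10, 6))` yields six boxes. Each section would then be reconstructed on its own, and the shared margins give the assembly step common geometry to blend neighbouring sections.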
The full README is longer than what was shown.
A research system that reconstructs detailed 3D indoor scene meshes with surface materials from multiple photos, using a generative AI prior to fill in geometry beyond what the photos directly show.
Mainly Python. The stack also includes PyTorch and CUDA.
The README does not state a license. Check the repository for a license file before use.
Setup difficulty is rated hard, with roughly a day or more to a first successful run.
Mainly researchers.
Verify against the repo before relying on details.