yxuanar/code-as-room

57 · Python

TLDR

Code-as-Room is research code attached to an academic paper from teams at the Shanghai Artificial Intelligence Laboratory, the Shanghai Innovation Institute, Southern University of Science and Technology, and the University of Warwick.

In plain English

Code-as-Room is research code attached to an academic paper from teams at the Shanghai Artificial Intelligence Laboratory, the Shanghai Innovation Institute, Southern University of Science and Technology, and the University of Warwick. The goal is to turn a single top-down picture of a room into a working 3D scene you can open in Blender, the free 3D modeling program. Instead of generating the scene as raw 3D data, the system writes Blender Python code that builds the room when run.

The approach is described as agentic. Large language models and vision-language models look at the image and produce structured information about it: what kind of room it is, what objects are in it, how they relate spatially, descriptions of major furniture, and so on. Deterministic code then orchestrates these stages, validates outputs, repairs problems, manages memory, and stitches the pieces together.

The pipeline goes through thirteen stages, numbered 0 through 12, covering scene classification, semantic and graph analysis, base Blender code, walls and minor placeholders, major object descriptions and geometry, surface-based placement of small objects, optional detailed small-object work, per-part PBR materials, real texture generation, and final lighting and render settings. This release focuses on the code-generation pipeline.

The README is honest about what is not yet included. Asset-retrieval data and checkpoints, a planned web-based editor for the generated scenes, support for more diverse room shapes, whole-floor-plan handling for multi-room layouts, and a full benchmark are listed as future releases. The current pipeline works best on rectangular or near-rectangular rooms.

Running it needs Python 3.10 or higher, Blender 3.6 or 4.x, and an OpenAI-compatible chat and vision API endpoint for the text and image stages. A separate image-generation endpoint is optional and only used for the texture stage.
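
To make the "scene as code" idea above concrete, here is a minimal sketch of how structured object data could be turned into Blender Python source and given a cheap validity check before it is run. It is not taken from the repository: the `Box` class and the `emit_blender_script` and `is_valid_python` helpers are hypothetical names invented for illustration, and the real pipeline's stages, data formats, and validation are more involved.

```python
import ast
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical placeholder object: a named box with a center and a size."""
    name: str
    location: tuple[float, float, float]
    size: tuple[float, float, float]


def emit_blender_script(boxes: list[Box]) -> str:
    """Return Blender Python source text that adds one scaled cube per box."""
    lines = ["import bpy", ""]
    for box in boxes:
        lines += [
            f"bpy.ops.mesh.primitive_cube_add(size=1.0, location={box.location!r})",
            "obj = bpy.context.object",
            f"obj.name = {box.name!r}",
            f"obj.scale = {box.size!r}",
            "",
        ]
    return "\n".join(lines)


def is_valid_python(source: str) -> bool:
    """Cheap validation step: check that the generated text at least parses."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# Assumed example data; a real run would derive this from the input image.
script = emit_blender_script([
    Box("bed", (1.0, 2.0, 0.25), (2.0, 1.6, 0.5)),
    Box("desk", (3.5, 0.5, 0.375), (1.2, 0.6, 0.75)),
])
```

The sketch only builds and checks text, so it runs without Blender installed. A script saved this way can be executed headlessly with `blender --background --python scene.py`, where `scene.py` is an assumed file name.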
Python dependencies are a small set: langchain-openai, langchain-core, openai, pillow, and requests. Configuration goes through environment variables or a JSON config file, with separate model, base URL, and API key entries for the main and texture endpoints.

A sample command runs the full pipeline against an example image in the repository. Outputs go into a timestamped run folder next to the input.

The repo also includes an image-prompt workflow that generates top-down room prompts and can call an image-generation endpoint, useful for producing the input images the main pipeline expects.

The code is released under Apache 2.0.
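
The configuration and output layout described above could look roughly like the following sketch. Everything specific in it is an assumption rather than the repository's actual interface: the `MAIN_*` and `TEXTURE_*` variable names, the `main` and `texture` JSON keys, the rule that environment variables take precedence over the file, and the `run_<timestamp>` folder name are all hypothetical.

```python
import json
import os
from dataclasses import dataclass
from datetime import datetime
from pathlib import Path


@dataclass
class Endpoint:
    model: str
    base_url: str
    api_key: str


def load_json_config(path):
    """Read the JSON config file, or return an empty dict if it is missing."""
    p = Path(path)
    return json.loads(p.read_text()) if p.exists() else {}


def load_endpoint(prefix, file_cfg, env=os.environ):
    """Build one endpoint; environment values win over the JSON file (assumed)."""
    section = file_cfg.get(prefix.lower(), {})

    def pick(field):
        return env.get(f"{prefix}_{field.upper()}") or section.get(field, "")

    model = pick("model")
    if not model:
        return None  # e.g. the optional texture endpoint is not configured
    return Endpoint(model, pick("base_url"), pick("api_key"))


def run_dir_for(image_path, now=None):
    """Return a timestamped run folder that sits next to the input image."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return Path(image_path).parent / f"run_{stamp}"
```

Returning `None` for an unconfigured endpoint lets a caller skip the texture stage cleanly, which matches the description of that endpoint as optional.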

Generated 2026-05-21 · Model: sonnet-4-6 · Verify against the repo before relying on details.