explaingit

yxuanar/code-as-room

Analysis updated 2026-06-24

Stars · 57 | Language · Python | Audience · researcher | Complexity · 4/5 | License · Apache 2.0 | Setup · hard

TLDR

Research pipeline that turns a top-down room photo into a working Blender 3D scene by having LLMs and VLMs write Blender Python code through thirteen orchestrated stages.

Mindmap

mindmap
  root((Code-as-Room))
    Inputs
      Top-down room image
      OpenAI-compatible API
      Optional image-gen endpoint
    Outputs
      Blender Python script
      3D scene file
      PBR materials
      Render settings
    Use Cases
      Reproduce paper experiments
      Prototype scene generation
      Study agentic pipelines
    Tech Stack
      Python
      Blender
      LangChain
      OpenAI

What do people build with it?

USE CASE 1

Reproduce the paper's image-to-3D-room pipeline on the included sample image

USE CASE 2

Generate Blender scenes from your own top-down room photos for research

USE CASE 3

Study how a thirteen-stage agentic pipeline orchestrates LLM and VLM calls

USE CASE 4

Use the image-prompt workflow to synthesize top-down room images as pipeline inputs

What is it built with?

Python, Blender, LangChain, OpenAI

How does it compare?

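| Repo | yxuanar/code-as-room | jsingletonai/dejavu | cp-cp/liveedit |
| --- | --- | --- | --- |
| Stars | 57 | 56 | 59 |
| Language | Python | Python | Python |
| Setup difficulty | hard | easy | hard |
| Complexity | 4/5 | 2/5 | 5/5 |
| Audience | researcher | developer | researcher |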

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · hard | Time to first run · 1h+

Needs Blender 3.6 or 4.x plus an OpenAI-compatible chat and vision endpoint, and the pipeline only handles near-rectangular rooms today.
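The version requirements above can be checked before a first run. The sketch below is only an illustration and is not part of the repository: the function names are hypothetical, and it assumes the first line of `blender --version` output has the form `Blender X.Y.Z`.

```python
import re
from typing import Optional, Tuple


def parse_blender_version(text: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from output assumed to look like 'Blender 4.2.0'."""
    match = re.search(r"Blender\s+(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def python_ok(version: Tuple[int, int]) -> bool:
    """Python 3.10 or higher, per the stated requirement."""
    return version >= (3, 10)


def blender_ok(version: Tuple[int, int]) -> bool:
    """Blender 3.6 or any 4.x release, per the stated requirement."""
    major, minor = version
    return (major, minor) == (3, 6) or major == 4
```

In practice the text passed to `parse_blender_version` could come from `subprocess.run(["blender", "--version"], capture_output=True, text=True).stdout`, and `sys.version_info[:2]` can be passed to `python_ok`.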

Apache 2.0, use and modify freely with attribution and a patent grant.

In plain English

Code-as-Room is research code attached to an academic paper from teams at the Shanghai Artificial Intelligence Laboratory, the Shanghai Innovation Institute, Southern University of Science and Technology, and the University of Warwick. The goal is to turn a single top-down picture of a room into a working 3D scene you can open in Blender, the free 3D modeling program. Instead of generating the scene as raw 3D data, the system writes Blender Python code that builds the room when run.

The approach is described as agentic. Large language models and vision-language models look at the image and produce structured information about it: what kind of room it is, what objects are in it, how they relate spatially, descriptions of major furniture, and so on. Deterministic code then orchestrates these stages, validates outputs, repairs problems, manages memory, and stitches the pieces together.

The pipeline goes through thirteen stages, numbered 0 through 12, covering scene classification, semantic and graph analysis, base Blender code, walls and minor placeholders, major object descriptions and geometry, surface-based placement of small objects, optional detailed small-object work, per-part PBR materials, real texture generation, and final lighting and render settings.

This release focuses on the code-generation pipeline. The README is honest about what is not yet included. Asset-retrieval data and checkpoints, a planned web-based editor for the generated scenes, support for more diverse room shapes, whole-floor-plan handling for multi-room layouts, and a full benchmark are listed as future releases. The current pipeline works best on rectangular or near-rectangular rooms.

Running it needs Python 3.10 or higher, Blender 3.6 or 4.x, and an OpenAI-compatible chat and vision API endpoint for the text and image stages. A separate image-generation endpoint is optional and only used for the texture stage.
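The orchestration described above, where deterministic code runs model-backed stages in order, validates each output, and feeds errors back for repair, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the repository's implementation: `Stage`, `run_pipeline`, the retry limit, and the stage names and indices are all hypothetical, and the LLM/VLM calls are replaced by plain Python stubs.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

# Hypothetical types for illustration; the real repository may differ.
StageFn = Callable[[Dict[str, Any], Optional[str]], Any]
Validator = Callable[[Any], Optional[str]]  # error message, or None if valid


@dataclass
class Stage:
    index: int
    name: str
    run: StageFn
    validate: Validator


def run_pipeline(stages: List[Stage], memory: Dict[str, Any],
                 max_repairs: int = 2) -> Dict[str, Any]:
    """Run stages in index order, passing validation errors back for repair."""
    for stage in sorted(stages, key=lambda s: s.index):
        error: Optional[str] = None
        for _attempt in range(max_repairs + 1):
            output = stage.run(memory, error)
            error = stage.validate(output)
            if error is None:
                memory[stage.name] = output  # later stages can read this
                break
        else:
            raise RuntimeError(f"stage {stage.index} ({stage.name}) failed: {error}")
    return memory


# Stub standing in for a model call: empty on the first try, fixed after feedback.
def classify_scene(memory, feedback):
    return "bedroom" if feedback else ""


def require_nonempty(output):
    return None if output else "empty output"


# Stub that emits Blender Python source as text, using an earlier stage's result.
def base_code(memory, feedback):
    return f"# Blender script for a {memory['scene_classification']}\nimport bpy\n"


def require_bpy_import(output):
    return None if "import bpy" in output else "script must import bpy"


stages = [
    Stage(0, "scene_classification", classify_scene, require_nonempty),
    Stage(1, "base_code", base_code, require_bpy_import),
]
result = run_pipeline(stages, memory={})
```

Keeping the control flow in ordinary code, with the models confined to individual stage functions, is what lets a failed validation trigger a targeted retry instead of restarting the whole run.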
Python dependencies are a small set: langchain-openai, langchain-core, openai, pillow, and requests. Configuration goes through environment variables or a JSON config file, with separate model, base URL, and API key entries for the main and texture endpoints.

A sample command runs the full pipeline against an example image in the repository. Outputs go into a timestamped run folder next to the input.

The repo also includes an image-prompt workflow that generates top-down room prompts and can call an image-generation endpoint, useful for producing the input images the main pipeline expects.

The code is released under Apache 2.0.
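As a rough picture of the configuration and output layout described above, the sketch below resolves a main and an optional texture endpoint from a JSON config with environment-variable overrides, and derives a timestamped run folder beside the input image. Every name here is an assumption made for illustration: the variable names (`MAIN_MODEL`, `TEXTURE_API_KEY`, and so on), the JSON keys, the precedence of environment over file, and the `run_<timestamp>` folder pattern are not taken from the repository.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from pathlib import Path
from typing import Mapping, Optional


@dataclass
class Endpoint:
    model: str
    base_url: str
    api_key: str


def load_endpoint(prefix: str, file_cfg: Mapping[str, Mapping[str, str]],
                  env: Mapping[str, str]) -> Optional[Endpoint]:
    """Resolve one endpoint; environment values override the JSON file."""
    section = file_cfg.get(prefix.lower(), {})
    values = {}
    for field in ("model", "base_url", "api_key"):
        # Hypothetical variable names such as MAIN_MODEL or TEXTURE_API_KEY.
        values[field] = env.get(f"{prefix}_{field.upper()}", section.get(field, ""))
    if not all(values.values()):
        return None  # incomplete entry: treat the endpoint as not configured
    return Endpoint(**values)


def run_dir_for(image_path: Path, now: Optional[datetime] = None) -> Path:
    """Return a timestamped run folder path next to the input image."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return image_path.parent / f"run_{stamp}"  # hypothetical naming pattern


file_cfg = json.loads(
    '{"main": {"model": "file-model", '
    '"base_url": "http://localhost:8000/v1", "api_key": "sk-placeholder"}}'
)
env = {"MAIN_MODEL": "override-model"}  # pass os.environ in real use

main = load_endpoint("MAIN", file_cfg, env)
texture = load_endpoint("TEXTURE", file_cfg, env)  # None: texture stage optional
run_dir = run_dir_for(Path("inputs/room.png"), datetime(2026, 6, 24, 12, 0, 0))
```

Returning `None` for an incomplete texture entry mirrors the idea that the image-generation endpoint is optional, so a caller can skip the texture stage rather than fail.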

Copy-paste prompts

Prompt 1
Set up Code-as-Room with Python 3.10, Blender 4.x, and an OpenAI-compatible endpoint, then run the example image through the full thirteen-stage pipeline
Prompt 2
Walk me through each of the thirteen stages in Code-as-Room and what artifact each one writes to the run folder
Prompt 3
Swap Code-as-Room's main chat model to a local OpenAI-compatible server and check which stages still work
Prompt 4
Modify Code-as-Room to handle an L-shaped room instead of the rectangular default and document where the pipeline breaks
Prompt 5
Skip the texture-generation stage of Code-as-Room and use solid PBR materials only to cut API cost

Frequently asked questions

What is code-as-room?

Research pipeline that turns a top-down room photo into a working Blender 3D scene by having LLMs and VLMs write Blender Python code through thirteen orchestrated stages.

What language is code-as-room written in?

Mainly Python. The stack also includes Blender, LangChain, and OpenAI.

What license does code-as-room use?

Apache 2.0, use and modify freely with attribution and a patent grant.

How hard is code-as-room to set up?

Setup difficulty is rated hard, with an hour or more to a first successful run.

Who is code-as-room for?

Mainly researchers.

Verify against the repo before relying on details.