ashawkey/stable-dreamfusion

★ 8,830 | Python | Audience · researcher | Complexity · 5/5 | Setup · hard

Mindmap

mindmap
  root((stable-dreamfusion))
    What it does
      Text to 3D mesh
      Image to 3D
      Export mesh files
    How it works
      NeRF 3D representation
      Stable Diffusion guide
      Instant-NGP renderer
    Inputs
      Text prompt
      Reference image
    Outputs
      Rotating video
      3D mesh with texture
    Requirements
      NVIDIA GPU 16GB VRAM
      CUDA installation

Things people build with this

USE CASE 1

Generate a 3D mesh of an object from a text prompt like 'a wooden chair' for use in 3D software or game assets.

USE CASE 2

Convert an existing photo into a rough 3D model using the image-to-3D mode with Zero-1-to-3.

USE CASE 3

Reproduce and experiment with DreamFusion-style text-to-3D generation research using publicly available Stable Diffusion.
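
For the first two use cases, the command-line invocations can be sketched as below. This is a hedged illustration, not the project's documented interface: the script name `main.py` and the flags `--text`, `--image`, `--workspace`, `-O`, `--test`, and `--save_mesh` are recalled from the project README and should be checked against the repository before use. The helper `build_command` and the workspace name `trial_chair` are hypothetical. The snippet only assembles and prints the commands; it does not run them.

```python
import shlex


def build_command(workspace, text=None, image=None, test=False, save_mesh=False):
    """Assemble a main.py invocation as an argument list (nothing is executed).

    Flag names are assumptions taken from memory of the README; verify them.
    """
    if not test and (text is None) == (image is None):
        raise ValueError("training needs exactly one of text or image")
    cmd = ["python", "main.py", "--workspace", workspace, "-O"]
    if text is not None:
        cmd += ["--text", text]        # text-to-3D guidance prompt
    if image is not None:
        cmd += ["--image", image]      # image-to-3D reference image
    if test:
        cmd.append("--test")           # reuse the trained workspace
    if save_mesh:
        cmd.append("--save_mesh")      # export a textured mesh
    return cmd


# Train from a prompt, then rerun the same script in test mode to export.
train = build_command("trial_chair", text="a wooden chair")
export = build_command("trial_chair", test=True, save_mesh=True)

print(shlex.join(train))
print(shlex.join(export))
```

Keeping the command as a list and quoting it with `shlex.join` avoids shell-quoting mistakes when the prompt contains spaces.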

Tech stack

Python, PyTorch, CUDA, Stable Diffusion, Instant-NGP

Getting it running

Difficulty · hard | Time to first run · 1 day+

Requires an NVIDIA GPU with at least 16 GB of VRAM and a working CUDA installation. The optional CUDA extensions must be compiled from source.
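
A quick way to sanity-check these requirements before starting is to ask PyTorch what it can see. The sketch below is an illustration, not part of the project: the helper names `vram_gib` and `check_gpu` are hypothetical, and the 15 GiB default threshold is an assumption chosen because cards sold as 16 GB typically report slightly less than 16 GiB of total memory.

```python
def vram_gib(total_bytes):
    """Convert a byte count to GiB."""
    return total_bytes / 1024**3


def check_gpu(min_gib=15.0):
    """Return a one-line report on the first CUDA device PyTorch can see."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"
    if not torch.cuda.is_available():
        return "No CUDA device visible to PyTorch"
    props = torch.cuda.get_device_properties(0)
    size = vram_gib(props.total_memory)
    verdict = "OK" if size >= min_gib else "below threshold"
    return f"{props.name}: {size:.1f} GiB VRAM ({verdict}), CUDA {torch.version.cuda}"


print(check_gpu())
```

This only confirms that PyTorch was built with CUDA support and can reach a device; it does not confirm that the optional CUDA extensions will compile.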

In plain English

Stable-Dreamfusion is a Python tool for generating 3D objects from text descriptions or images. You type a prompt like "a hamburger" and the system produces a 3D model you can view and export as a mesh file. The project implements the DreamFusion research paper using Stable Diffusion, a publicly available image-generation model, as the guidance engine, since the model referenced in the original paper is not publicly released. The authors note upfront that this is a work in progress and results do not yet match the quality shown in the paper.

The 3D generation works by repeatedly asking a 2D image model to evaluate and refine a 3D representation called a NeRF (neural radiance field). The NeRF stores a 3D scene as a mathematical function and renders it from any angle. Stable-Dreamfusion uses an accelerated NeRF variant called Instant-NGP that can render at around 10 frames per second on a GPU with 16 GB of memory. Once training is done, you can export the result as a standard 3D mesh with textures.

You run training from the command line with a text prompt. After training finishes, you run the same script in test mode to export a video rotating around the object or to save the mesh file. The tool also supports image-to-3D: if you start from an existing photo, you can pass it in instead of a text prompt, though this requires downloading an additional pretrained model called Zero-1-to-3. A simple graphical interface is included as well.

The main requirements are a recent NVIDIA GPU (tested on a V100 with 16 GB of VRAM), a working CUDA installation, and Python with PyTorch. Setup involves installing Python packages and optionally compiling custom CUDA extensions. Google Colab notebooks are provided so you can try it without a local GPU.
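
The refinement loop described above, in which a frozen 2D image model repeatedly scores renders of a 3D representation, is what the DreamFusion paper calls Score Distillation Sampling. The toy below illustrates the shape of that update in pure Python. It is not the repository's code: the "renderer" is an identity function standing in for a NeRF, the "diffusion model" is a closed-form noise estimate for a single known target standing in for text-conditioned Stable Diffusion, and every name (`TARGET`, `render`, `predict_noise`, `sds_step`) and the weighting `w(t) = var` are assumptions made for illustration.

```python
import math
import random

random.seed(0)

TARGET = [0.9, 0.1, 0.5, 0.3]  # stand-in for "what the prompt describes"


def render(params):
    """Toy renderer: the 'image' is just the parameters.
    In the real system this is a NeRF rendered from a random camera."""
    return list(params)


def predict_noise(noisy, a, s):
    """Toy frozen 'diffusion model': the exact noise estimate when all
    data mass sits at TARGET. The real model is a learned network."""
    return [(x_t - a * m) / s for x_t, m in zip(noisy, TARGET)]


def sds_step(params, lr=0.5):
    image = render(params)
    var = random.uniform(0.02, 0.98)            # noise variance for this step
    a, s = math.sqrt(1.0 - var), math.sqrt(var)
    eps = [random.gauss(0.0, 1.0) for _ in image]
    noisy = [a * x + s * e for x, e in zip(image, eps)]
    eps_hat = predict_noise(noisy, a, s)
    # SDS-style gradient: w(t) * (eps_hat - eps) * d(image)/d(params).
    # The render Jacobian is the identity here; w(t) = var is assumed.
    grad = [var * (eh - e) for eh, e in zip(eps_hat, eps)]
    return [p - lr * g for p, g in zip(params, grad)]


params = [0.0, 0.0, 0.0, 0.0]
for _ in range(200):
    params = sds_step(params)

print([round(p, 3) for p in params])
```

The point of the sketch is that the image model is never trained: only the scene parameters move, pulled toward images the frozen model considers likely.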

Copy-paste prompts

Prompt 1

Walk me through running stable-dreamfusion to generate a 3D mesh from the text prompt 'a red apple' on a local GPU.

Prompt 2

How do I set up the Zero-1-to-3 model with stable-dreamfusion to convert a single photo into a 3D object?

Prompt 3

What NVIDIA GPU and CUDA version does stable-dreamfusion require, and how do I verify my setup is compatible before starting?

Prompt 4

How do I run stable-dreamfusion on Google Colab so I can try text-to-3D generation without a local GPU?

Prompt 5

Explain how Instant-NGP and Stable Diffusion work together in the DreamFusion approach to produce a 3D mesh from a text prompt.

Verify against the repo before relying on details.