stability-ai/generative-models

Analysis updated 2026-05-18

★ 27,136 | Python | Audience · developer | Complexity · 4/5 | License | Setup · hard

Mindmap

mindmap
  root((repo))
    What it does
      Video to 3D views
      Image to 360 video
      Novel view synthesis
    Models included
      SV4D 2.0
      SV3D
      Stable Video Diffusion
    How to use
      Download from HuggingFace
      Run locally on GPU
      Python-based
    Use cases
      3D asset generation
      Creative tool building
      AI video research

What do people build with it?

USE CASE 1

Generate 3D object views from a single video by rendering the same object from multiple camera angles.

USE CASE 2

Create 360-degree orbital videos around objects captured in a single still image.

USE CASE 3

Build creative applications that turn 2D images or videos into multi-view 3D-like experiences.

What is it built with?

Python · PyTorch · CUDA · HuggingFace · Diffusion models

How does it compare?

	stability-ai/generative-models	sgl-project/sglang	huggingface/smolagents
Stars	27,136	27,141	27,114
Language	Python	Python	Python
Setup difficulty	hard	hard	moderate
Complexity	4/5	4/5	3/5
Audience	developer	developer	developer

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · hard | Time to first run · 1h+

Requires CUDA-capable GPU, large model downloads, and PyTorch/CUDA environment setup.
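
The requirements above can be checked before starting a long setup. The following is a minimal sketch, not part of the repository: the `preflight` helper and its report keys are hypothetical names chosen for illustration. It only reports whether PyTorch is importable, whether `nvidia-smi` is on the PATH, whether PyTorch can see a CUDA device, and how much free disk space is available for model downloads.

```python
import importlib.util
import shutil


def preflight(download_dir: str = ".") -> dict:
    """Hypothetical helper: collect basic facts about the local environment."""
    report = {
        # True if a 'torch' package can be found, without importing it yet.
        "torch_installed": importlib.util.find_spec("torch") is not None,
        # True if the NVIDIA driver's command-line tool is on the PATH.
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
        # Filled in below only when torch is actually installed.
        "cuda_available": False,
        # Free space in GiB on the disk that holds download_dir.
        "free_disk_gib": shutil.disk_usage(download_dir).free / 2**30,
    }
    if report["torch_installed"]:
        import torch

        report["cuda_available"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for key, value in preflight().items():
        print(f"{key}: {value}")
```

A `False` for `cuda_available` on a machine with an NVIDIA GPU usually points to a CPU-only PyTorch build or a driver mismatch, which is worth resolving before downloading any weights.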

Use freely for any purpose including commercial, as long as you keep the copyright notice.

In plain English

This repository, Generative Models by Stability AI, hosts a series of research models that generate visual content from images and short videos. The README walks through the releases by date.

Stable Video 4D 2.0 (SV4D 2.0) is described as a video-to-4D diffusion model: it takes a short input video of a moving object and produces novel-view videos that look like the same scene filmed from other camera angles. The earlier Stable Video 4D and Stable Video 3D models are also documented; SV3D is described as an image-to-video model that generates multiple synthetic views from a single picture. All of these are diffusion models, the family of generative AI systems that produce images or videos by gradually refining noise into a coherent output guided by an input.

The README gives practical numbers for SV4D 2.0: it generates 48 frames (12 video frames for each of 4 camera views) at 576-by-576 resolution from a 12-frame input, ideally clean white-background footage of a single moving object. Longer outputs are produced by running the model in steps and feeding earlier results back in. The sampling scripts accept a gif or mp4 file, a folder of frames, or a filename pattern; they download weights from Hugging Face and write the generated frames to an output folder. Options cover the number of sampling steps, camera elevation, background removal, and running on cards with less memory.

Someone would use this repository to research synthesizing new views of objects from limited footage, for example multi-view content generation or 4D asset creation. The README marks the releases as intended for research purposes. The code is written in Python and uses PyTorch with CUDA.
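
The SV4D 2.0 output figures quoted above (12 frames for each of 4 views, 576 by 576 pixels) can be made concrete with a little arithmetic. This is a sketch under stated assumptions, not code from the repository: the `OutputGrid` class is a hypothetical name, the view-major ordering in `flat_index` is an assumption about how one might lay the frames out, and the byte count covers only raw RGB pixels, not the memory the model itself needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputGrid:
    """Hypothetical description of one SV4D 2.0 output batch."""

    frames: int = 12   # video frames per camera view (from the summary above)
    views: int = 4     # camera views (from the summary above)
    height: int = 576  # pixels
    width: int = 576   # pixels

    @property
    def total_images(self) -> int:
        # One image per (view, frame) pair.
        return self.frames * self.views

    def flat_index(self, view: int, frame: int) -> int:
        """Position of (view, frame) in a flat list, assuming view-major order."""
        if not (0 <= view < self.views and 0 <= frame < self.frames):
            raise IndexError(f"view={view}, frame={frame} is outside the grid")
        return view * self.frames + frame

    def float32_bytes(self, channels: int = 3) -> int:
        # Raw pixel storage only: images * channels * height * width * 4 bytes.
        return self.total_images * channels * self.height * self.width * 4


grid = OutputGrid()
print(grid.total_images)                 # 48
print(grid.flat_index(view=3, frame=11))  # 47
print(grid.float32_bytes() / 2**20)      # 182.25 (MiB)
```

The decoded frames themselves are small; the memory pressure mentioned in the setup notes comes from the model weights and intermediate activations, which this sketch does not estimate.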

Copy-paste prompts

Prompt 1

How do I set up SV4D 2.0 from this repo to convert a video of an object into multi-view 3D renders?

Prompt 2

Show me the code to download and run SV3D locally to generate a 360-degree video from a single image.

Prompt 3

What GPU memory do I need to run these Stability AI models, and how do I optimize for faster generation?

Frequently asked questions

What is generative-models?

Research models from Stability AI that use diffusion to generate multi-view imagery and novel-view videos from a single image or video.

What language is generative-models written in?

Mainly Python. The stack also includes PyTorch and CUDA.

What license does generative-models use?

Use freely for any purpose including commercial, as long as you keep the copyright notice.

How hard is generative-models to set up?

Setup difficulty is rated hard, with an hour or more to a first successful run.

Who is generative-models for?

Mainly developers.

Verify against the repo before relying on details.