lllyasviel/controlnet

Analysis updated 2026-06-20

★ 33,858 | Python | Audience · researcher | Complexity · 4/5 | Setup · hard

Mindmap

mindmap
  root((ControlNet))
    What it does
      Visual image control
      Pose conditioning
      Edge and depth guidance
    How it works
      Locked base model
      Trainable copy
      Zero convolution layers
    Condition types
      Body pose
      Sketch edges
      Depth maps
      Scribbles
    Tech
      Python
      Stable Diffusion
      Gradio
      PyTorch


What do people build with it?

USE CASE 1

Generate a character illustration that exactly matches the body pose from a reference photo.

USE CASE 2

Turn a rough pencil sketch into a polished AI-generated image that preserves the sketch's composition and layout.

USE CASE 3

Re-render a scene in a different art style while keeping the depth structure of the original photo intact.

USE CASE 4

Produce consistent product placement across multiple AI-generated images using a depth map as a template.

What is it built with?

Python, PyTorch, Stable Diffusion, Gradio

How does it compare?

	lllyasviel/controlnet	pythagora-io/gpt-pilot	hkuds/cli-anything
Stars	33,858	33,770	33,734
Language	Python	Python	Python
Setup difficulty	hard	hard	easy
Complexity	4/5	4/5	3/5
Audience	researcher	developer	developer

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · hard | Time to first run · 1h+

Requires a GPU with at least 4GB VRAM and a Stable Diffusion 1.5 model download of approximately 2GB.

The analysis does not state which license the repository uses.

In plain English

ControlNet solves a real creative problem: when you use AI image generators like Stable Diffusion, you can describe what you want in text, but you have very little control over the exact composition, pose, or structure of the result. ControlNet adds a way to guide image generation with visual signals such as edge outlines, human body poses, depth maps, or hand-drawn scribbles, so the AI generates images that follow the structure you provide, not just your words.

The way it works is clever: it makes a copy of part of the image-generation neural network. One copy is "locked" and stays unchanged, which preserves the original model's capability, while the other copy is "trainable" and learns to respond to your extra visual condition. The two copies are connected through "zero convolution" layers: small 1x1 convolutions whose weights and biases are initialized to zero, so they output nothing at the start. Training therefore begins without disturbing the original model, and as it continues these connectors gradually learn to inject the visual condition into the generation process.

You would use ControlNet when you want to generate an image that matches a specific pose, follows the edges of a sketch you drew, mirrors the depth structure of a reference photo, or replicates the layout of a line drawing. Instead of prompting and hoping, you get reproducible control over structure.

The stack is Python and PyTorch, built on top of Stable Diffusion 1.5 (the popular open-source image model), with Gradio providing interactive browser-based demos. Supporting tools include OpenPose for body pose detection, MiDaS for depth estimation, and various edge-detection algorithms. Training can run on consumer GPUs with limited memory.
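The locked copy, trainable copy, and zero convolutions described above can be illustrated with a toy sketch. This is not the repository's code: plain Python lists stand in for tensors, the two "blocks" are trivial stand-ins for real network layers, and every function name here is hypothetical. It only aims to show why zero-initialized 1x1 convolutions leave the base model's output unchanged at the start of training.

```python
# Toy illustration only: lists stand in for tensors, and all names are
# hypothetical. A real implementation would use PyTorch layers instead.

def conv_1x1(x, weight, bias):
    """Apply a 1x1 convolution to x, a list of pixels (each a list of channels).

    A 1x1 convolution mixes channels independently at every pixel, so it
    reduces to one small matrix-vector product per pixel.
    """
    return [
        [sum(w * c for w, c in zip(row, pixel)) + b for row, b in zip(weight, bias)]
        for pixel in x
    ]

def make_zero_conv(channels):
    """Return (weight, bias) for a 1x1 convolution initialized to all zeros."""
    weight = [[0.0] * channels for _ in range(channels)]
    bias = [0.0] * channels
    return weight, bias

def add(a, b):
    """Element-wise sum of two feature maps with the same shape."""
    return [[p + q for p, q in zip(pa, pb)] for pa, pb in zip(a, b)]

def locked_block(x):
    """Stand-in for a frozen block of the base model (never updated)."""
    return [[2.0 * c for c in pixel] for pixel in x]

def trainable_copy(x):
    """Stand-in for the trainable copy, which starts out identical to the locked block."""
    return [[2.0 * c for c in pixel] for pixel in x]

def controlled_block(x, condition, zero_in, zero_out):
    """Combine the locked block with the condition-aware trainable copy."""
    # The visual condition enters the trainable copy through one zero convolution.
    injected = add(x, conv_1x1(condition, *zero_in))
    # The copy's output is added back to the locked output through another.
    residual = conv_1x1(trainable_copy(injected), *zero_out)
    return add(locked_block(x), residual)

if __name__ == "__main__":
    x = [[1.0, 2.0], [3.0, 4.0]]           # two pixels, two channels
    condition = [[0.5, -0.5], [1.0, 0.0]]  # e.g. an encoded edge or pose map
    out = controlled_block(x, condition, make_zero_conv(2), make_zero_conv(2))
    # With untrained (all-zero) connectors the condition has no effect yet.
    print(out == locked_block(x))
```

Once training moves the connector weights away from zero, the residual term becomes non-zero and the condition starts to influence the output, while the locked block itself is never modified.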

Copy-paste prompts

Prompt 1

Load ControlNet with an OpenPose condition and generate an image of a person in the exact pose shown in this reference photo.

Prompt 2

How do I use ControlNet's Canny edge detection model to generate an image that follows the outlines of my sketch?

Prompt 3

Set up the ControlNet Gradio demo locally so I can test pose, depth, and scribble conditions interactively in a browser.

Prompt 4

I want to generate product photos where the item always appears in the same position. How do I use a depth map condition with ControlNet?

Prompt 5

What consumer GPU specs do I need to run ControlNet locally, and can it run on a laptop with 8GB VRAM?

Frequently asked questions

What is ControlNet?

ControlNet lets you guide AI image generation with visual inputs such as body poses, sketch edges, depth maps, or scribbles, so you control the exact structure of the result rather than only describing it in text.

What language is ControlNet written in?

Mainly Python. The stack also includes PyTorch, Stable Diffusion, and Gradio.

What license does ControlNet use?

The analysis does not state which license the repository uses.

How hard is ControlNet to set up?

Setup difficulty is rated hard, with roughly 1h+ to a first successful run.

Who is ControlNet for?

Mainly researchers.

Verify against the repo before relying on details.