Edit photos with plain-English commands like 'make it look vintage' or 'add a sunset' without manual masking.
Run an interactive browser-based image editor using the included Gradio web app.
Download the 454,000-example training dataset of paired before-and-after edits to fine-tune your own model.
Requires a GPU with more than 18 GB of VRAM; there is no CPU or low-VRAM fallback.
InstructPix2Pix is a research project from UC Berkeley that lets you edit images by describing the change you want in plain English. You provide an input image and a text instruction like "turn him into a cyborg" or "add snow," and the model produces a new version of the image with that edit applied. It was published as an academic paper, and this repository contains the code to run it and the data used to train it.

The model is built on top of Stable Diffusion, a popular open-source image generation model. Fine-tuning Stable Diffusion on paired image examples, before and after an edit, taught the model to follow editing instructions while leaving the rest of the original image unchanged.

Running the model requires a GPU with more than 18 gigabytes of memory. You can edit a single image from the command line by passing in the image file and your instruction as text. There is also an interactive web application powered by Gradio that lets you upload images and type instructions in a browser interface. Parameters such as the number of diffusion steps and the guidance strength can be adjusted to trade off output quality against faithfulness to the input image and the instruction.

The training dataset consists of around 454,000 examples, each containing an original image, an editing instruction, and the edited result. The dataset was built in two stages: first, GPT-3 was fine-tuned to generate captions and matching edit instructions; then Stable Diffusion, combined with a technique called Prompt-to-Prompt, converted those paired text captions into paired images. Two versions of the dataset are available for download: a full random-sample version and a higher-quality filtered version selected using CLIP scoring.
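To make the "guidance strength" parameter mentioned above more concrete, here is a minimal sketch of how guidance scales can combine a diffusion model's noise predictions. It assumes the two-scale classifier-free guidance formulation described in the InstructPix2Pix paper, with one scale for the input image and one for the text instruction. The function name `combine_guidance` and its argument names are hypothetical, chosen for illustration, and are not taken from the repository's code. Plain Python lists stand in for the tensors a real implementation would use.

```python
from typing import List, Sequence


def combine_guidance(
    eps_uncond: Sequence[float],
    eps_image: Sequence[float],
    eps_full: Sequence[float],
    image_scale: float,
    text_scale: float,
) -> List[float]:
    """Blend three noise predictions using two guidance scales.

    eps_uncond: prediction with neither image nor instruction conditioning
    eps_image:  prediction conditioned on the input image only
    eps_full:   prediction conditioned on both image and instruction

    (All names here are illustrative assumptions, not the repo's API.)
    """
    if not (len(eps_uncond) == len(eps_image) == len(eps_full)):
        raise ValueError("all predictions must have the same length")

    combined = []
    for u, i, f in zip(eps_uncond, eps_image, eps_full):
        # Start from the unconditional prediction, push toward the input
        # image by image_scale, then toward the instruction by text_scale.
        combined.append(u + image_scale * (i - u) + text_scale * (f - i))
    return combined


if __name__ == "__main__":
    # Toy one-element "predictions"; the scale values are illustrative only.
    print(combine_guidance([0.0], [1.0], [3.0], image_scale=1.5, text_scale=7.5))
```

Under this formulation, raising the image scale keeps the output closer to the input image, while raising the text scale applies the instruction more strongly. Setting both scales to 1 reduces to the fully conditioned prediction.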
Verify against the repo before relying on details.