tencentarc/photomaker

★ 10,109 | Jupyter Notebook | Audience · researcher | Complexity · 4/5 | Setup · hard

Mindmap

mindmap
  root((PhotoMaker))
    What it does
      Identity-preserving generation
      No training needed
      Text-guided scenes
    Versions
      V1 realistic output
      V2 improved accuracy
    Integrations
      ControlNet
      LoRA modules
      ComfyUI
    Requirements
      11GB GPU VRAM
      Python and PyTorch
      Hugging Face weights


Things people build with this

USE CASE 1

Generate realistic photos of a specific person in different settings using just a handful of reference snapshots.

USE CASE 2

Create illustrated or artistic portraits of a real person by combining PhotoMaker with a style LoRA module.

USE CASE 3

Build a custom portrait generation pipeline by adding PhotoMaker as an adapter on top of an existing diffusers workflow.

USE CASE 4

Run experiments in ComfyUI or Replicate using community-built PhotoMaker nodes without writing Python code.

Tech stack

Python, PyTorch, Jupyter Notebook, Stable Diffusion XL

Getting it running

Difficulty · hard | Time to first run · 1h+

Requires a GPU with at least 11 GB of VRAM, Python 3.8+, and PyTorch 2.0+. Model weights download automatically from Hugging Face on first run.
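A quick way to check these requirements before installing anything heavy is a small preflight script. The sketch below is illustrative only: the function names (`parse_version`, `meets_minimum`, `preflight`) are hypothetical helpers, not part of PhotoMaker, and the thresholds are simply the numbers quoted above. VRAM is compared in decimal gigabytes, on the assumption that the 11 GB figure refers to the card's marketed capacity.

```python
import re
import sys

# Thresholds taken from the requirements stated above.
MIN_PYTHON = (3, 8)
MIN_TORCH = (2, 0)
MIN_VRAM_GB = 11


def parse_version(text):
    """Extract (major, minor) from a version string such as '2.1.0+cu118'."""
    match = re.match(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(found, required):
    """Compare two (major, minor) tuples."""
    return tuple(found) >= tuple(required)


def preflight():
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if not meets_minimum(sys.version_info[:2], MIN_PYTHON):
        problems.append("Python 3.8 or higher is required")
    try:
        import torch  # imported lazily so the script runs without PyTorch
    except ImportError:
        problems.append("PyTorch is not installed")
        return problems
    if not meets_minimum(parse_version(torch.__version__), MIN_TORCH):
        problems.append(f"PyTorch 2.0 or higher is required, found {torch.__version__}")
    if not torch.cuda.is_available():
        problems.append("No CUDA GPU is visible to PyTorch")
    else:
        # Decimal gigabytes, so an 11 GB card is not rejected by GiB rounding.
        total_gb = torch.cuda.get_device_properties(0).total_memory / 1e9
        if total_gb < MIN_VRAM_GB:
            problems.append(f"GPU has {total_gb:.1f} GB of VRAM, 11 GB is recommended")
    return problems


if __name__ == "__main__":
    issues = preflight()
    print("Ready to install PhotoMaker" if not issues else "\n".join(issues))
```

Only the first GPU (index 0) is inspected; on a multi-GPU machine you would loop over `torch.cuda.device_count()`.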

In plain English

PhotoMaker is a research project from Tencent that generates new photos of a specific real person by learning what they look like from a small set of reference images. You give it one or more photos of someone along with a text description of the scene you want, and it produces new images showing that person in the described setting or style. The person's facial identity stays consistent across the generated images without any lengthy training step, which is the central claim of the work. The paper was presented at CVPR 2024.

The system is built on top of Stable Diffusion XL, a popular open-source AI image generation model. PhotoMaker adds an adapter layer that encodes the identity from the reference photos and injects it into the generation process using what the authors call stacked ID embedding. Two versions exist: V1 for realistic-looking output, and V2 (supported by Tencent's HunyuanDiT team) with improved accuracy in preserving facial details. The tool can also be combined with other customization add-ons called LoRA modules, and it supports additional control tools like ControlNet and T2I-Adapter.

Running PhotoMaker requires a GPU with at least 11 GB of memory, Python 3.8 or higher, and PyTorch 2.0 or higher. Installation is done through pip, and the model weights download automatically from Hugging Face the first time you use it. The code integrates with the standard diffusers Python library, so developers already familiar with that workflow can add PhotoMaker as an adapter without rebuilding a pipeline from scratch.

Live demos are available on Hugging Face Spaces for both the realistic and stylization modes, and community members have built implementations for ComfyUI, Replicate, and Windows environments. The stylization mode produces illustrated or artistic renderings of the same person, achieved by swapping the base model and enabling LoRA modules for style.
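To make the diffusers integration concrete, here is a minimal sketch of a V1 generation call. It follows the usage pattern in the project's README as best recalled and should be checked against the repo. Several details are assumptions not stated in this summary: the `photomaker` package name and `PhotoMakerStableDiffusionXLPipeline` class, the `load_photomaker_adapter` arguments, the checkpoint name `photomaker-v1.bin`, the base model `SG161222/RealVisXL_V3.0`, and the trigger word `img` that marks the person in the prompt. The helper `has_trigger` and the wrapper `generate` are hypothetical names introduced for illustration.

```python
import os
import re

TRIGGER_WORD = "img"  # assumption: the README's default trigger word


def has_trigger(prompt, trigger=TRIGGER_WORD):
    """True if the trigger word appears as a standalone word in the prompt."""
    return re.search(rf"\b{re.escape(trigger)}\b", prompt) is not None


def generate(reference_paths, prompt, device="cuda", seed=42):
    """Generate one image of the person shown in reference_paths (sketch only)."""
    # Validate the prompt first so mistakes surface before any large download.
    if not has_trigger(prompt):
        raise ValueError(f"prompt must contain the trigger word {TRIGGER_WORD!r}")

    # Heavy imports stay local so the helper above works without a GPU stack.
    import torch
    from diffusers.utils import load_image
    from huggingface_hub import hf_hub_download
    from photomaker import PhotoMakerStableDiffusionXLPipeline  # assumed name

    # Assumed checkpoint location on Hugging Face; cached after the first run.
    ckpt = hf_hub_download(
        repo_id="TencentARC/PhotoMaker",
        filename="photomaker-v1.bin",
        repo_type="model",
    )

    # Any SDXL-based model should work here; this one is an assumption.
    pipe = PhotoMakerStableDiffusionXLPipeline.from_pretrained(
        "SG161222/RealVisXL_V3.0",
        torch_dtype=torch.float16,
        use_safetensors=True,
        variant="fp16",
    ).to(device)

    # Attach the identity adapter on top of the base pipeline.
    pipe.load_photomaker_adapter(
        os.path.dirname(ckpt),
        subfolder="",
        weight_name=os.path.basename(ckpt),
        trigger_word=TRIGGER_WORD,
    )
    pipe.fuse_lora()

    id_images = [load_image(path) for path in reference_paths]
    generator = torch.Generator(device=device).manual_seed(seed)
    result = pipe(
        prompt=prompt,
        input_id_images=id_images,
        num_images_per_prompt=1,
        num_inference_steps=50,
        start_merge_step=10,  # assumed README default for when identity is merged in
        generator=generator,
    )
    return result.images[0]
```

A call might look like `generate(["a.jpg", "b.jpg", "c.jpg"], "a photo of a man img hiking in the Alps")`, where the file names are placeholders. V2 is understood to need additional face-embedding inputs, so this sketch covers V1 only.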

Copy-paste prompts

Prompt 1

I have 3 reference photos of a person and want to generate new images of them using PhotoMaker. Walk me through the Python setup and a basic generation call.

Prompt 2

Help me combine PhotoMaker V2 with a ControlNet pose to generate an image of a specific person in a specific body position.

Prompt 3

I want to use PhotoMaker in stylization mode to create illustrated portraits. What LoRA and base model do I need?

Prompt 4

Show me how to add PhotoMaker as an adapter to an existing diffusers pipeline I already use for image generation.

Prompt 5

I want to run PhotoMaker in a ComfyUI workflow. What nodes do I need and how do I pass in reference images?


Verify against the repo before relying on details.