fudancvl/occlusionformer

Analysis updated 2026-06-24

★ 16 | Python | Audience · researcher | Complexity · 4/5 | Setup · hard

Mindmap

mindmap
  root((OcclusionFormer))
    Inputs
      Layout JSON
      FLUX base model
      OcclusionFormer checkpoint
    Outputs
      Layered image
      Correct Z order
    Use Cases
      Layout to image research
      Occlusion ablations
      Demo via Streamlit
    Tech Stack
      Python
      PyTorch
      FLUX
      Streamlit
      Hugging Face

What do people build with it?

USE CASE 1

Reproduce ICML 2026 results on layout-to-image with overlapping boxes

USE CASE 2

Compose a custom scene from a layout JSON with correct front-to-back order

USE CASE 3

Benchmark against the SA-Z dataset with amodal annotations

USE CASE 4

Try the Streamlit demo to explore Z-order conditioned generation
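As a rough illustration of Use Case 2, the sketch below builds a layout with two overlapping boxes and an explicit front-to-back order, then serializes it to JSON. The schema is an assumption made for illustration: the field names (`prompt`, `instances`, `bbox`, `z`) and the convention that a smaller `z` means closer to the viewer are hypothetical and may not match the example layout files shipped in the repository, so check those files for the real format.

```python
import json

# Hypothetical layout schema: every field name here is illustrative only,
# not taken from the OcclusionFormer repository.
layout = {
    "prompt": "a cat sitting in front of a sofa",
    "width": 1024,
    "height": 1024,
    "instances": [
        # bbox is [x_min, y_min, x_max, y_max] in pixels (assumed convention)
        {"label": "sofa", "bbox": [100, 300, 900, 900], "z": 1},
        {"label": "cat", "bbox": [400, 500, 700, 950], "z": 0},
    ],
}


def overlap_area(a, b):
    """Return the intersection area of two [x_min, y_min, x_max, y_max] boxes."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0) * max(height, 0)


# Sort so the nearest instance comes first (smaller z = closer, by assumption).
front_to_back = sorted(layout["instances"], key=lambda inst: inst["z"])
order = [inst["label"] for inst in front_to_back]

# The two boxes overlap, so the Z-order decides which object is visible there.
area = overlap_area(layout["instances"][0]["bbox"], layout["instances"][1]["bbox"])

layout_json = json.dumps(layout, indent=2)
print(order, area)
```

Here the cat's box overlaps the sofa's box, and the `z` values say the cat is in front, which is the kind of ordering information the model is meant to respect.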

What is it built with?

Python, PyTorch, FLUX, Streamlit

How does it compare?

	fudancvl/occlusionformer	adya84/ha-world-cup-2026	afk-surf/safeclipper
Stars	16	16	16
Language	Python	Python	Python
Setup difficulty	hard	easy	moderate
Complexity	4/5	2/5	3/5
Audience	researcher	general	developer

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · hard | Time to first run · 1 day+

Needs a Python 3.11 conda environment, the FLUX base model weights, and the OcclusionFormer checkpoint from Hugging Face.

In plain English

OcclusionFormer is a research project from Fudan University that accompanies a paper accepted at ICML 2026, a machine learning conference. It tackles a specific problem in image generation: when a model is asked to draw a scene from one bounding box per object and those boxes overlap, current methods often blend the textures together or get the order wrong, so an object that should be behind ends up looking like it is in front.

The authors propose handling the front-to-back order, which they call Z-order, as an explicit step in the model. According to the README, the approach has three pieces: each object instance is generated separately; the instances are then composed using a method borrowed from volume rendering that decides how much each layer shows through; and a queried alignment step keeps each object in its correct spatial position. Alongside the model, the team is releasing a dataset called SA-Z, which adds occlusion order and amodal annotations (information about the parts of objects hidden behind other objects) to layout data.

This repository is the inference and demo package. It contains the model code, a Streamlit web demo, a command-line inference script, example layout JSON files, and a requirements file. The model weights and the SA-Z dataset are hosted on Hugging Face, and the paper is on arXiv. To run it, the README walks through creating a Python 3.11 conda environment, installing the requirements, downloading the checkpoint, and then either starting the Streamlit demo or calling the CLI script with paths to a base FLUX model, the OcclusionFormer checkpoint, and a layout JSON. One open task remains: organizing the amodal annotations on Hugging Face.
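To make the volume-rendering idea mentioned above more concrete, here is a minimal sketch of standard front-to-back alpha compositing at a single pixel: each layer contributes its color weighted by its own opacity and by how much light the nearer layers let through. This is the textbook compositing rule, not the paper's formulation; OcclusionFormer's actual composition step may differ, and the function name and the (color, alpha) representation are assumptions for illustration.

```python
from typing import Sequence, Tuple

Color = Tuple[float, float, float]


def composite_front_to_back(
    layers: Sequence[Tuple[Color, float]],
    background: Color = (0.0, 0.0, 0.0),
) -> Color:
    """Blend (color, alpha) samples at one pixel, ordered nearest first.

    Illustrative helper only; not part of the OcclusionFormer codebase.
    """
    out = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet blocked by nearer layers
    for color, alpha in layers:
        weight = transmittance * alpha  # how much this layer shows through
        for k in range(3):
            out[k] += weight * color[k]
        transmittance *= 1.0 - alpha  # layers behind see less light
    # Whatever light is left reaches the background.
    for k in range(3):
        out[k] += transmittance * background[k]
    return (out[0], out[1], out[2])


red_front = ((1.0, 0.0, 0.0), 1.0)   # opaque red object in front
blue_back = ((0.0, 0.0, 1.0), 1.0)   # opaque blue object behind

opaque_result = composite_front_to_back([red_front, blue_back])
half_result = composite_front_to_back([((1.0, 0.0, 0.0), 0.5), blue_back])
print(opaque_result, half_result)
```

With an opaque front layer the back layer receives zero weight, so the overlap region shows only the front object instead of a blend of both. With a half-transparent front layer the two colors mix in equal parts.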

Copy-paste prompts

Prompt 1

Walk me through setting up a Python 3.11 conda env for OcclusionFormer and downloading the FLUX base plus checkpoint

Prompt 2

Show me how to write a layout JSON with overlapping bounding boxes and run the CLI inference script

Prompt 3

Help me run the Streamlit demo and understand the volume rendering composition step

Prompt 4

Explain how SA-Z amodal annotations are used during training versus inference in OcclusionFormer

Frequently asked questions

What is occlusionformer?

Inference code and demo for the ICML 2026 OcclusionFormer paper, which composes overlapping objects with explicit Z-order on top of FLUX.

What language is occlusionformer written in?

Mainly Python. The stack also includes PyTorch, FLUX, and Streamlit.

How hard is occlusionformer to set up?

Setup difficulty is rated hard, with roughly a day or more to a first successful run.

Who is occlusionformer for?

Mainly researchers.

Verify against the repo before relying on details.