explaingit

vt-vl-lab/3d-photo-inpainting

7,073 · Python · Audience · researcher · Complexity · 4/5 · License · MIT · Setup · hard

TLDR

Turn any single flat photo into a short 3D video with realistic camera movement, using AI depth estimation and scene filling. This is the implementation of a research paper from CVPR 2020.

Mindmap

mindmap
  root((3d-photo-inpainting))
    What it does
      Depth estimation
      Scene inpainting
      3D video generation
    Outputs
      Zoom and swing videos
      Dolly zoom effect
      3D mesh file
    Tech stack
      Python
      PyTorch
      MiDaS depth
      EdgeConnect
    Use cases
      Photo animation
      3D content creation
      Research reproduction

Things people build with this

USE CASE 1

Turn a flat photo into a 3D cinematic video with zoom, swing, and dolly zoom camera movements

USE CASE 2

Generate a 3D mesh file from a 2D image for use in graphics or animation projects

USE CASE 3

Reproduce the CVPR 2020 3D photo inpainting paper results on your own images

USE CASE 4

Try the technique in-browser using the linked Google Colab notebook without installing anything

Tech stack

Python · PyTorch · Linux · MiDaS · EdgeConnect

Getting it running

Difficulty · hard
Time to first run · 1h+

Requires Linux, a compatible Python and PyTorch version, and a setup script to download pretrained model weights.
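Before running the setup script, a quick sanity check along these lines can confirm the basics listed above. This is only a sketch: the function name `environment_report` is made up for illustration, and it deliberately does not check specific Python or PyTorch version numbers, since this summary does not state them (the repo's README is the place to look).

```python
import importlib.util
import platform
import sys


def environment_report():
    """Collect the basic facts the setup depends on (hypothetical helper)."""
    return {
        # The project is documented as requiring Linux.
        "is_linux": platform.system() == "Linux",
        # Reported for comparison against whatever version the README asks for.
        "python": "{}.{}".format(*sys.version_info[:2]),
        # find_spec locates the package without importing it.
        "torch_installed": importlib.util.find_spec("torch") is not None,
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```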

Use freely for any purpose, including commercial use, as long as you keep the copyright notice.

In plain English

This repository contains the code from a research paper published at CVPR 2020 (a major computer vision conference). The project takes a single ordinary photo as input and converts it into a short 3D video where the camera appears to move through the scene, creating a sense of depth that the original flat image does not have.

The underlying technique works in two steps. First, the code estimates how far away different parts of the image are from the camera, producing a depth map. Second, it fills in the parts of the scene that would have been hidden behind foreground objects from the original viewpoint. This fill-in step, called inpainting, is what lets the system render the scene from slightly different angles without leaving obvious holes. The result is a layered 3D representation that can be displayed using standard graphics tools.

Once you run the code on a photo, it saves several video files showing different camera movements: zooming in, swinging side to side, moving in a circle, and a dolly zoom effect. If you want, it can also save a 3D mesh file of the scene. The whole process typically takes two to three minutes per image, depending on the machine.

Setting up the project requires Linux, a compatible version of Python, and PyTorch. A setup script downloads the pretrained model weights. There is also a Google Colab notebook linked in the README for anyone who wants to try it in a browser without installing anything locally.

The code is released under the MIT license. The README includes a citation block for the original paper and credits to several other open-source projects the code builds on, including MiDaS for depth estimation and EdgeConnect for image inpainting.
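To see why the inpainting step described above is needed, here is a toy sketch in plain NumPy. It is not the repository's code and does not use its models. It takes a hand-made depth map, shifts each pixel sideways by an amount inversely proportional to its depth (a crude stand-in for moving the camera), and reports which output pixels received no source pixel. The function name `shift_view` and the `baseline` parameter are assumptions made for this illustration.

```python
import numpy as np


def shift_view(image, depth, baseline=8.0):
    """Forward-warp a grayscale image sideways by a depth-dependent shift.

    Returns the warped image and a boolean mask of holes, i.e. output
    pixels that no source pixel landed on.
    """
    h, w = depth.shape
    # Near pixels (small depth) move further than far pixels.
    disparity = np.rint(baseline / depth).astype(int)
    warped = np.zeros_like(image)
    nearest = np.full((h, w), np.inf)  # z-buffer: closest depth seen so far
    filled = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            nx = x + disparity[y, x]
            # Keep a pixel only if it lands in frame and is the closest so far.
            if 0 <= nx < w and depth[y, x] < nearest[y, nx]:
                warped[y, nx] = image[y, x]
                nearest[y, nx] = depth[y, x]
                filled[y, nx] = True
    return warped, ~filled


# Toy scene: a far, dark background with a near, bright rectangle.
depth = np.full((4, 12), 8.0)
image = np.full((4, 12), 0.2)
depth[1:3, 4:7] = 2.0
image[1:3, 4:7] = 1.0

warped, holes = shift_view(image, depth)
print(np.flatnonzero(holes[0]).tolist())  # background-only row
print(np.flatnonzero(holes[1]).tolist())  # row crossing the rectangle
```

In the row that crosses the rectangle, the foreground moves further than the background, exposing a gap of pixels that were never visible in the input. Those gaps are what an inpainting step has to fill with plausible content; rows with only background lose just the single column at the image border.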

Copy-paste prompts

Prompt 1
Using 3D Photo Inpainting, I want to convert a portrait photo into a 3D video with a circular camera movement. Walk me through running the Python code step by step.
Prompt 2
Which Python version and PyTorch version does vt-vl-lab/3d-photo-inpainting require? I want to set up the environment on Ubuntu.
Prompt 3
Help me run the 3D-photo-inpainting setup script to download the pretrained model weights and verify everything is working.
Prompt 4
I ran 3D Photo Inpainting and want to export a 3D mesh file instead of video, how do I enable that output option?
Prompt 5
Help me open the 3D-photo-inpainting Colab notebook and run it on a photo I upload, without installing anything locally.

Verify against the repo before relying on details.