icetea-cv/mv-roma

17 Python Audience · researcher Complexity · 4/5 Setup · hard

TLDR

A Python research library that finds matching points across multiple photos simultaneously, keeping correspondences consistent across the whole set, to enable cleaner 3D reconstruction from ordinary images.

Mindmap

mindmap
  root((mv-roma))
    What it does
      Multi-view point matching
      3D reconstruction support
      CVPR 2026 research
    How it works
      One source many targets
      Pixel-level correspondence
      Confidence score per match
    Pretrained Models
      Outdoor MegaDepth weights
      Indoor scene weights
    Setup
      Python 3.10+ required
      PyTorch and NVIDIA GPU
      UFM dependency

Things people build with this

USE CASE 1

Match points across 10+ photos of a scene simultaneously to get cleaner, globally consistent correspondences for 3D reconstruction.

USE CASE 2

Use pre-trained outdoor or indoor models to get pixel-level matches with confidence scores without training from scratch.

USE CASE 3

Replace pairwise feature matching in an existing 3D reconstruction pipeline with MV-RoMa's multi-view consistent approach.

USE CASE 4

Test point matching quality on your own images using the included demo script right after setup.

Tech stack

Python · PyTorch · CUDA

Getting it running

Difficulty · hard Time to first run · 1h+

Requires an NVIDIA GPU, PyTorch, and the UFM library installed as a separate dependency before the included demo script will run.
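
Before attempting the demo, a quick pre-flight check can confirm the prerequisites listed above. The following is a minimal sketch, not part of MV-RoMa: `check_environment` is a hypothetical helper, the Python 3.10 minimum comes from this page's setup notes, and the import name of the UFM library is not stated here, so it is left for you to add to `required_modules` once you know it.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 10), required_modules=("torch",)):
    """Report whether the interpreter and listed modules look usable.

    Hypothetical helper for illustration; it only inspects the local
    environment and does not import or call MV-RoMa itself.
    """
    report = {"python_ok": sys.version_info[:2] >= tuple(min_python)}

    # find_spec returns None when a top-level module is not installed.
    for name in required_modules:
        report[name] = importlib.util.find_spec(name) is not None

    # Only query CUDA if PyTorch is actually present.
    if report.get("torch"):
        import torch

        report["cuda_available"] = torch.cuda.is_available()

    return report


report = check_environment()
for key, value in report.items():
    print(f"{key}: {value}")
```

If `cuda_available` is missing or `False`, PyTorch is either not installed or cannot see an NVIDIA GPU, and the demo is unlikely to run as described.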

License terms are not mentioned in the repository description.

In plain English

MV-RoMa is a Python library and research project from a group of computer vision researchers, presented at CVPR, a major academic conference, in 2026. The goal is to find matching points between photographs, which is a core step in building 3D models from ordinary images. When you take several photos of the same object or scene from different angles, software can reconstruct a 3D model by figuring out which spot in one photo corresponds to which spot in another.

Most existing tools compare two photos at a time. MV-RoMa matches across multiple photos simultaneously, keeping matches consistent across the whole set rather than treating each pair independently. The result is cleaner point tracks, meaning a single real-world location can be reliably followed across many photos.

The library comes with pre-trained neural network weights for outdoor scenes (trained on a dataset called MegaDepth) and for indoor scenes. You give the model one source image and several target images, and it returns a map showing where each pixel in the source lands in each target, along with a confidence score for each prediction.

Running the project requires a computer with a compatible NVIDIA GPU, Python 3.10 or later, and the PyTorch deep learning framework. Setup involves installing several dependencies, including a separate library called UFM. A demo script is included so you can test the model on your own images right after setup.

This is a research tool intended for computer vision engineers and researchers working on 3D reconstruction pipelines. It is not a consumer product, and using it effectively requires familiarity with deep learning and image processing concepts.
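
To make the one-source, many-targets output more concrete, here is a minimal sketch of how per-pixel maps and confidence scores could be turned into point tracks. It does not use MV-RoMa's actual API: the array layout (`warps` of shape `(N, H, W, 2)` holding `(x, y)` target pixel coordinates, `confidences` of shape `(N, H, W)`), the `build_tracks` helper, and the threshold values are all assumptions made for illustration, and the data below is synthetic.

```python
import numpy as np


def build_tracks(warps, confidences, threshold=0.5, min_views=2, stride=4):
    """Collect point tracks from dense source-to-target maps.

    warps:       (N, H, W, 2) array, assumed (x, y) position in target n
                 for every source pixel (hypothetical layout).
    confidences: (N, H, W) array of per-pixel scores in [0, 1].
    A source pixel becomes a track if at least `min_views` targets
    have confidence >= `threshold` at that pixel.
    """
    warps = np.asarray(warps, dtype=np.float64)
    confidences = np.asarray(confidences, dtype=np.float64)
    _, height, width, _ = warps.shape

    tracks = []
    # Sample the source image on a coarse grid to keep the track count small.
    for y in range(0, height, stride):
        for x in range(0, width, stride):
            keep = confidences[:, y, x] >= threshold
            if keep.sum() < min_views:
                continue
            observations = {
                int(t): (float(warps[t, y, x, 0]), float(warps[t, y, x, 1]))
                for t in np.flatnonzero(keep)
            }
            tracks.append({"source": (x, y), "targets": observations})
    return tracks


# Synthetic example: 3 targets, each a pure pixel shift of an 8x8 source.
H, W = 8, 8
ys, xs = np.mgrid[0:H, 0:W]
identity = np.stack([xs, ys], axis=-1).astype(np.float64)
warps = np.stack([identity + shift for shift in (1.0, 2.0, 3.0)])

confidences = np.zeros((3, H, W))
confidences[0] = 0.9  # confident match
confidences[1] = 0.9  # confident match
confidences[2] = 0.1  # low confidence, filtered out

tracks = build_tracks(warps, confidences, threshold=0.5, min_views=2, stride=4)
print(len(tracks), tracks[0])
```

Each track ties one source pixel to its positions in every target that passed the confidence threshold, which is the kind of multi-image observation a 3D reconstruction pipeline consumes. The real library's output format and coordinate convention may differ, so check the repository before adapting this.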

Copy-paste prompts

Prompt 1
I have 10 photos of a building from different angles. Show me how to use MV-RoMa to match points across all 10 images at once using the MegaDepth outdoor pretrained model.
Prompt 2
Write a Python script that loads a source image and 4 target images with MV-RoMa, runs the model, and visualizes the top matches colored by confidence score.
Prompt 3
How do I install MV-RoMa and its UFM dependency on a machine with an NVIDIA GPU, Python 3.10, and PyTorch 2.x?
Prompt 4
Show me how to take MV-RoMa's pixel correspondence maps and confidence scores and feed them into a structure-from-motion pipeline as feature matches.

Verify against the repo before relying on details.