facebookresearch/maskrcnn-benchmark

★ 9,377 | Python | Audience · researcher | Complexity · 4/5 | Setup · hard

Mindmap

mindmap
  root((maskrcnn-benchmark))
    What it does
      Object detection
      Instance segmentation
      Webcam demo
    Tech
      PyTorch
      CUDA
      Multi-GPU training
    Data
      COCO dataset
      Pre-trained weights
    Status
      Deprecated
      Replaced by Detectron2


Things people build with this

USE CASE 1

Run object detection and instance segmentation on photos using pre-trained COCO weights without training from scratch.

USE CASE 2

Train a custom model across multiple GPUs using distributed training with mixed-precision for faster runs.

USE CASE 3

Use the webcam or Jupyter notebook demo to see bounding boxes and segmentation masks drawn in real time.
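The difference between the two outputs drawn by the demo can be illustrated with a small sketch. This is not code from the library: `mask_to_box`, `mask_area`, and `box_area` are hypothetical helpers, and the mask is a toy grid of 0/1 values rather than a real model prediction. It shows that a segmentation mask carries more information than a bounding box, since the tightest box can always be derived from the mask but not the other way around.

```python
def mask_to_box(mask):
    """Return (x_min, y_min, x_max, y_max) of the tightest box around a binary mask.

    `mask` is a list of rows, each a list of 0/1 values.
    Returns None when the mask contains no foreground pixels.
    """
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, value in enumerate(row) if value]
    if not ys:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


def mask_area(mask):
    """Count the foreground pixels in a binary mask."""
    return sum(1 for row in mask for value in row if value)


def box_area(box):
    """Count the pixels covered by an inclusive (x_min, y_min, x_max, y_max) box."""
    x_min, y_min, x_max, y_max = box
    return (x_max - x_min + 1) * (y_max - y_min + 1)


# Toy example: a plus-shaped object on a 4x5 grid.
toy_mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
]
toy_box = mask_to_box(toy_mask)
```

Here the box covers more pixels than the object itself, which is the gap instance segmentation closes by labeling only the pixels that belong to the object.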

Tech stack

Python, PyTorch, CUDA, COCO

Getting it running

Difficulty · hard | Time to first run · 1h+

Training requires a CUDA-capable GPU and a download of the COCO dataset (~20GB).

Open source, released for research purposes. Specific license terms are not stated in this explanation.

In plain English

This repository, created by Facebook Research, is a now-deprecated Python library for training and running computer vision models that detect objects in images and draw precise outlines around them. It covers two main tasks: object detection, which draws a box around each recognized item in a photo, and instance segmentation, which goes further by tracing the exact pixel boundary of each object.

The README notes upfront that this project has been replaced by a newer Facebook Research library called Detectron2, which covers everything this one did. Anyone starting fresh is directed there instead.

For those who encounter this older codebase, it was built on top of PyTorch and was notable at release for being faster and less memory-hungry than competing implementations. It could train across multiple GPUs at once and supported a mixed-precision mode that sped things up further on compatible hardware. Inference, the step of running the model on new images to get predictions, could also run on a CPU without a GPU.

The library includes a webcam demo and a Jupyter notebook demo so you can see the model working in real time, outlining objects the camera sees. Training required downloading a large image dataset called COCO, which contains labeled examples for 80 object categories. Pre-trained model weights were provided so you could run inference without training from scratch.

Configuration was file-based: you picked a YAML config file specifying the model architecture and training settings, then passed it to a training script. Multi-GPU runs used PyTorch's built-in distributed training launcher.

The project is open source, created for research purposes, and the code remains publicly available even though active development has moved to Detectron2.
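The config-file-plus-launcher workflow described above can be sketched as a small command builder. This is a minimal sketch, not the project's own tooling: `build_launch_command` is a hypothetical helper, and it assumes the training script sits at `tools/train_net.py`, accepts a `--config-file` argument, and takes trailing `KEY VALUE` pairs that override the YAML config, with `DTYPE float16` assumed to be the override that enables mixed precision. The config path is a placeholder. Check each of these against the repository before use.

```python
import shlex


def build_launch_command(num_gpus, config_file, mixed_precision=False,
                         train_script="tools/train_net.py"):
    """Build the argv list for a multi-GPU training run.

    Assumptions (verify against the repo): the script path, the
    --config-file flag, and trailing KEY VALUE config overrides.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    command = [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}",  # one worker process per GPU
        train_script,
        "--config-file", config_file,
    ]
    if mixed_precision:
        # Assumed config override that switches training to half precision.
        command += ["DTYPE", "float16"]
    return command


if __name__ == "__main__":
    # Placeholder config path; substitute a real YAML file from the repo.
    argv = build_launch_command(4, "path/to/config/file.yaml", mixed_precision=True)
    print(shlex.join(argv))
```

Returning a list rather than a single string keeps the arguments safe to pass to `subprocess.run` without shell quoting issues; `shlex.join` is used only to display the command.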

Copy-paste prompts

Prompt 1

Using maskrcnn-benchmark, show me how to load a pre-trained COCO model and run inference on a folder of images, saving the output with bounding boxes drawn.

Prompt 2

I want to fine-tune maskrcnn-benchmark on my own labeled dataset. Walk me through the YAML config changes needed to point it at my data and adjust the number of object categories.

Prompt 3

Write a multi-GPU training launch command for maskrcnn-benchmark using PyTorch's distributed launcher with 4 GPUs and mixed-precision enabled.


Verify against the repo before relying on details.