facebookresearch/slowfast

★ 7,358 | Python | Audience · researcher | Complexity · 4/5 | Setup · hard

Mindmap

mindmap
  root((PySlowFast))
    What it does
      Video understanding
      Action recognition
      Object detection
    Architectures
      SlowFast network
      Vision transformers
      Self-supervised models
    Resources
      Model zoo
      Pre-trained weights
      Visualization tools
    Requirements
      Python PyTorch
      GPU hardware
      Video datasets

Things people build with this

USE CASE 1

Train a video action recognition model on your own dataset using PyTorch and pre-built configs.

USE CASE 2

Download pre-trained SlowFast weights from the model zoo and run inference on a video clip without training from scratch.

USE CASE 3

Fine-tune a vision transformer architecture for detecting actions in sports, surveillance, or robotics footage.

USE CASE 4

Reproduce published FAIR video understanding results for comparison in academic research.

Tech stack

Python | PyTorch

Getting it running

Difficulty · hard | Time to first run · 1 day+

Requires a GPU, a working PyTorch setup, and large video datasets. Dataset preparation alone is covered by separate documentation files in the repository.

In plain English

PySlowFast is a research codebase from Facebook AI Research (FAIR) for training and evaluating models that understand what is happening in videos. Rather than analyzing still images, video understanding models look at sequences of frames to recognize actions, detect objects over time, and understand motion. The repository implements several research architectures for this task.

The name comes from the SlowFast network, one of the included models, which processes the same video at two different frame rates simultaneously. A slow pathway samples only a few frames and spends most of the network's capacity on them to capture spatial detail. A fast pathway samples many more frames to capture motion, and stays lightweight by using far fewer channels rather than by reducing the resolution of each frame. Other included architectures range from convolutional networks to vision transformers adapted for video, including models that can learn from unlabeled video data.

Pre-trained model weights are available for download through the project's model zoo, so researchers can start from existing checkpoints rather than training from scratch. The repository also includes visualization tools to inspect model behavior during training and inference.

The intended audience is machine learning researchers and engineers working on video analysis problems such as action recognition and object detection in video. Using the codebase requires familiarity with Python and PyTorch, and access to video datasets. Installation instructions and dataset preparation guides are provided in separate documentation files within the repository.
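
The two-rate sampling behind the SlowFast idea can be illustrated with a small sketch. This is not the repository's code: `pack_pathways` is a hypothetical helper, frames are stood in for by integers, and the ratio `alpha = 4` is an illustrative assumption rather than a recommended setting. The fast pathway keeps every frame of the clip, while the slow pathway keeps an evenly spaced subset that is `alpha` times smaller.

```python
from typing import List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def pack_pathways(
    frames: Sequence[Frame], alpha: int = 4
) -> Tuple[List[Frame], List[Frame]]:
    """Split one clip into (slow, fast) frame lists.

    `alpha` is the assumed frame-rate ratio between the fast and slow
    pathways; the value used here is illustrative only.
    """
    if alpha < 1:
        raise ValueError("alpha must be a positive integer")

    num_frames = len(frames)
    num_slow = num_frames // alpha
    if num_slow < 1:
        raise ValueError("clip is too short for this alpha")

    if num_slow == 1:
        slow_indices = [0]
    else:
        # Evenly spaced indices from the first to the last frame.
        # Integer arithmetic avoids floating-point rounding at the ends.
        slow_indices = [
            (i * (num_frames - 1)) // (num_slow - 1) for i in range(num_slow)
        ]

    slow = [frames[i] for i in slow_indices]  # few frames, sparse in time
    fast = list(frames)                       # every frame, dense in time
    return slow, fast


# A 32-frame clip, with frame numbers standing in for image tensors.
clip = list(range(32))
slow, fast = pack_pathways(clip, alpha=4)
print(len(slow), len(fast))  # 8 32
print(slow)                  # [0, 4, 8, 13, 17, 22, 26, 31]
```

In a real pipeline the two lists would be tensors fed to separate network branches, and the repository's own data loader may sample frames differently, so treat the index scheme above as one plausible choice.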

Copy-paste prompts

Prompt 1

How do I load a pre-trained SlowFast model from the PySlowFast model zoo and run it on a local video file?

Prompt 2

I want to fine-tune PySlowFast on my own action recognition dataset. Walk me through data preparation and the config file.

Prompt 3

Explain the SlowFast slow and fast pathways. What frame rates should I set for each, and why?

Prompt 4

How do I use the PySlowFast visualization tools to inspect which video frames my model is focusing on during inference?

Prompt 5

I want to cite the SlowFast paper in my research. What is the arXiv reference mentioned in the repo?

Verify against the repo before relying on details.