explaingit

facebookresearch/deit

Analysis updated 2026-07-03 · repo last pushed 2024-03-15

4,349 stars · Python · Audience: researcher · Complexity: 3/5 · Dormant · License: Apache 2.0 · Setup: moderate

TLDR

Pre-trained image recognition models from Meta that work well with limited training data, using transformer-based architectures instead of older convolutional networks.

Mindmap

mindmap
  root((deit))
    What it does
      Image classification
      Data-efficient training
      Pre-trained models
    Architectures
      DeiT transformers
      CaiT deeper models
      ResMLP feedforward
    Use Cases
      Fine-tune on small datasets
      Research benchmarking
      Production deployment
    Tech Stack
      Python
      PyTorch
      Vision Transformers
    Audience
      ML researchers
      ML engineers

What do people build with it?

USE CASE 1

Fine-tune a pre-trained image classifier on your own product photos when you only have a few thousand labeled examples.

USE CASE 2

Compare vision transformer architectures against traditional convolutional networks for a research project.

USE CASE 3

Download and deploy a ready-made image recognition model without training from scratch.

USE CASE 4

Study and reproduce state-of-the-art data-efficient image classification research from Meta.

What is it built with?

Python · PyTorch · Vision Transformers

How does it compare?

Repo: facebookresearch/deit · structuredlabs/preswald · facebookresearch/vjepa2
Stars: 4,349 · 4,290 · 4,235
Language: Python · Python · Python
Last pushed: 2024-03-15 · 2026-03-23
Maintenance: Dormant · Maintained
Setup difficulty: moderate · easy · hard
Complexity: 3/5 · 2/5 · 4/5
Audience: researcher · data · researcher

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · moderate
Time to first run · 30 min

Requires a compatible GPU and a PyTorch installation; pretrained model weights must be downloaded separately.

Apache 2.0: use freely for any purpose, including commercial, as long as you keep the license notice.
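As a rough illustration of a first run, the sketch below loads a pretrained model through `torch.hub` and classifies one local image. It assumes `torch`, `torchvision`, `timm`, and `Pillow` are installed in versions compatible with the repo, and that the model name `deit_base_patch16_224` and the ImageNet-style preprocessing apply to the variant you choose. The helper names `softmax`, `top_k`, and `classify_image` are hypothetical and not part of the repo.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, k=5):
    """Return the k (class_index, probability) pairs with the highest probability."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def classify_image(image_path, model_name="deit_base_patch16_224", k=5):
    """Hypothetical helper: download weights via torch.hub and score one image."""
    # Heavy dependencies are imported lazily so the pure-Python helpers above
    # stay usable without them.
    import torch
    from PIL import Image
    from torchvision import transforms

    device = "cuda" if torch.cuda.is_available() else "cpu"
    # Assumption: the hub entry point and model name match the repo's hubconf.
    model = torch.hub.load("facebookresearch/deit:main", model_name, pretrained=True)
    model.eval().to(device)

    # Assumption: standard ImageNet evaluation preprocessing at 224x224.
    preprocess = transforms.Compose([
        transforms.Resize(256, interpolation=transforms.InterpolationMode.BICUBIC),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    image = Image.open(image_path).convert("RGB")
    batch = preprocess(image).unsqueeze(0).to(device)

    with torch.no_grad():
        logits = model(batch)[0].tolist()
    return top_k(softmax(logits), k)
```

Calling `classify_image("photo.jpg")` (with a path of your own) would return ImageNet class indices with their probabilities. Mapping indices to human-readable labels needs a separate label file, which this sketch leaves out.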

In plain English

This repository contains practical implementations and trained models for several modern approaches to image recognition, the task of teaching computers to identify what's in a picture. Rather than requiring massive amounts of labeled training data as older methods did, these approaches are designed to work well even when data is limited.

At a high level, the repository gives you working code and pre-trained models based on research papers published by Meta (formerly Facebook) researchers between 2021 and 2023. The core innovation across these projects is finding smarter ways to train image recognition systems so they need less data and compute time. Some approaches swap traditional convolutional networks (the older standard) for transformer-based architectures (originally developed for language), while others mix transformer concepts into convolutional designs. The README lists seven research projects, each with its own documentation, including DeiT (the original data-efficient transformer approach), CaiT (making transformers deeper), and ResMLP (using simpler feedforward networks).

The intended users are researchers, machine learning engineers, and practitioners who want to either build image classification systems or study how these newer architectures work. If you're training a model to classify product photos but only have a few thousand labeled examples (instead of millions), or if you want to understand how vision transformers compare to traditional convolutional networks, you'd use this repository. It provides the training scripts you need to fine-tune these models on your own data, as well as already-trained models you can download and use immediately.

A practical aspect worth noting: the code comes with clear instructions in a separate README file for each project, pre-trained model weights you can download, and the underlying academic papers so you can understand the theory behind each approach. This makes it useful both for practitioners who just want to use the models and for researchers who want to dive into the implementation details. The repository is dormant, however (last push 2024-03-15), so do not count on ongoing fixes or updates.
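To make the small-dataset fine-tuning scenario more concrete, here is a minimal sketch of head-only fine-tuning: replace the ImageNet classifier with one sized for your classes, then freeze everything else. This is not the repo's own training script. It assumes the loaded model exposes its classifier as `head` (and `head_dist` for distilled variants), and the helper names `replace_head` and `freeze_backbone` are hypothetical.

```python
def replace_head(model, num_classes):
    """Swap the existing classifier layer(s) for fresh ones with num_classes outputs."""
    # Imported lazily so freeze_backbone below can be used without PyTorch.
    import torch.nn as nn

    # Assumption: classifiers are nn.Linear layers named "head" / "head_dist".
    for attr in ("head", "head_dist"):
        old = getattr(model, attr, None)
        if old is not None:
            setattr(model, attr, nn.Linear(old.in_features, num_classes))
    return model


def freeze_backbone(model, head_prefixes=("head",)):
    """Leave only parameters whose names start with a head prefix trainable.

    Works on any object with a named_parameters() method that yields
    (name, parameter) pairs, as torch.nn.Module does.
    Returns the names of the parameters left trainable.
    """
    trainable = []
    for name, param in model.named_parameters():
        is_head = name.startswith(head_prefixes)
        param.requires_grad = is_head
        if is_head:
            trainable.append(name)
    return trainable
```

After these two calls, an optimizer would be built only over the parameters that still have `requires_grad` set. Training only the head is a cheap starting point when labeled data is scarce; unfreezing more of the network may give better accuracy at a higher risk of overfitting.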

Copy-paste prompts

Prompt 1
Using the DeiT repo from facebookresearch, write Python code to load a pretrained DeiT model and classify a local image file.
Prompt 2
Show me how to fine-tune a DeiT model from facebookresearch/deit on a custom dataset with only 1000 labeled images per class.
Prompt 3
Explain the difference between DeiT, CaiT, and ResMLP architectures in facebookresearch/deit and when I should pick each one.
Prompt 4
Write a script using facebookresearch/deit to evaluate a pretrained model on my own validation set and report top-1 accuracy.
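Prompt 4 asks for top-1 accuracy. As a reference point for checking whatever script the prompt produces, the metric itself can be sketched in plain Python. The function name `top_k_accuracy` and the list-of-lists input format are assumptions made for illustration, not something the repo defines.

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores: one list of per-class scores per sample.
    labels: the true class index for each sample.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")

    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)
```

With `k=1` this is the top-1 accuracy from the prompt; `k=5` gives the top-5 figure often reported alongside it.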

Frequently asked questions

What is deit?

Pre-trained image recognition models from Meta that work well with limited training data, using transformer-based architectures instead of older convolutional networks.

What language is deit written in?

Mainly Python. The stack also includes PyTorch and Vision Transformers.

Is deit actively maintained?

Dormant: no commits in 2+ years (last push 2024-03-15).

What license does deit use?

Apache 2.0: use freely for any purpose, including commercial, as long as you keep the license notice.

How hard is deit to set up?

Setup difficulty is rated moderate, with roughly 30min to a first successful run.

Who is deit for?

Mainly researchers.


Verify against the repo before relying on details.