explaingit

cvat-ai/cvat

15,837 · Python · Audience · data · Complexity · 3/5 · Setup · moderate

TLDR

A platform for drawing labels on images and videos to create training data for computer vision AI models, available as a free hosted service, a paid cloud plan, or a self-hosted Docker deployment.

Mindmap

mindmap
  root((repo))
    Annotation types
      Images and video
      3D point clouds
      Instance segmentation
    Deployment options
      Free hosted tier
      Cloud subscription
      Self-hosted Docker
    Developer tools
      Python SDK
      CLI tool
      REST API
    Audience
      Data teams
      ML researchers

Things people build with this

USE CASE 1

Label a dataset of traffic camera images with bounding boxes around cars and pedestrians to train an object detection model.

USE CASE 2

Set up a self-hosted CVAT instance with Docker so your team can annotate proprietary images without sending data to a third party.

USE CASE 3

Use the cvat-sdk Python library to automate annotation import and export within an existing ML pipeline.

USE CASE 4

Annotate 3D point cloud data or video frame-by-frame for instance segmentation in a computer vision research project.

Tech stack

Python · Docker · Datumaro

Getting it running

Difficulty · moderate
Time to first run · 30 min

Self-hosting requires Docker and Docker Compose. The free hosted tier at cvat.ai supports up to 10 tasks and 500 MB of data.
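
To decide between the hosted tier and self-hosting, it can help to check a dataset against the limits quoted above. The following is a minimal sketch, not part of CVAT: the function names are hypothetical, and it assumes that 1 MB means 10**6 bytes, which the source does not specify.

```python
from pathlib import Path

# Limits quoted in the summary above for the free hosted tier.
FREE_TIER_MAX_TASKS = 10
FREE_TIER_MAX_BYTES = 500 * 10**6  # assumption: 1 MB = 10**6 bytes


def folder_size_bytes(folder):
    """Return the total size in bytes of all files under `folder`."""
    return sum(p.stat().st_size for p in Path(folder).rglob("*") if p.is_file())


def fits_free_tier(task_count, total_bytes):
    """Return True if both the task count and data size are within the limits."""
    return task_count <= FREE_TIER_MAX_TASKS and total_bytes <= FREE_TIER_MAX_BYTES
```

For example, `fits_free_tier(3, folder_size_bytes("images/"))` would report whether three tasks built from a local `images/` folder stay within the quoted limits.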

In plain English

CVAT, which stands for Computer Vision Annotation Tool, is a platform for labeling images and videos so they can be used to train computer-vision models. When someone is building an AI that needs to recognise objects in pictures (say, cars in traffic footage or defects on a production line), the model first needs thousands of examples where a human has drawn boxes around the objects and tagged them. CVAT is the workspace where that drawing and tagging happens.

It is offered in three ways. You can use the free hosted version at cvat.ai, where you can create up to ten annotation tasks and upload up to 500 MB of data. You can pay for a cloud subscription that lifts those limits and unlocks features like auto-annotation and integrations with Roboflow and HuggingFace. Or you can self-host the tool on your own servers using prebuilt Docker images (server and UI), which the README says have been downloaded more than a million times.

The platform supports image, video, and 3D annotation, and it can import and export many common dataset formats so the labels work with other tools in the machine-learning pipeline. A related project called Datumaro is included for transforming datasets further. CVAT also exposes a server API, a Python SDK installable with pip install cvat-sdk, and a command-line tool installable with pip install cvat-cli, so teams can automate work or plug CVAT into their own scripts.

You would use CVAT when you have raw images or videos and need labeled training data, whether you are a researcher, a startup training a model, or an enterprise running a labeling team. The codebase is primarily Python. The full README is longer than what was provided.
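
As an illustration of the automation the Python SDK allows, here is a hedged sketch that creates a task from a local image folder and later exports its annotations. The helper names, the task name, the labels, and the list of image suffixes are all hypothetical. The `cvat_sdk` calls (`make_client`, `tasks.create_from_data`, `export_dataset`) and the "COCO 1.0" format name reflect my understanding of the SDK's high-level API and should be checked against the version you install.

```python
from pathlib import Path

# Assumption: which file suffixes count as images for this sketch.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def build_task_spec(name, label_names):
    """Build the dict describing a new task and its labels."""
    return {"name": name, "labels": [{"name": label} for label in label_names]}


def list_images(folder):
    """Return sorted paths of image files directly inside `folder`."""
    return sorted(
        str(p) for p in Path(folder).iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
    )


def upload_and_export(host, username, password, folder, out_zip):
    """Create a task from local images, then export its annotations as a zip."""
    # Imported lazily so the helpers above work without cvat-sdk installed.
    from cvat_sdk import make_client
    from cvat_sdk.core.proxies.tasks import ResourceType

    with make_client(host, credentials=(username, password)) as client:
        task = client.tasks.create_from_data(
            spec=build_task_spec("traffic-cameras", ["car", "pedestrian"]),
            resource_type=ResourceType.LOCAL,
            resources=list_images(folder),
        )
        # In practice, annotation happens in the CVAT UI between these steps.
        task.export_dataset(
            format_name="COCO 1.0", filename=out_zip, include_images=False
        )
        return task.id
```

Keeping the spec-building and file-listing helpers separate from the network call makes them easy to test without a running CVAT server.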

Copy-paste prompts

Prompt 1
Help me deploy CVAT on my server using Docker Compose so my team can annotate images on our own infrastructure without using the hosted service.
Prompt 2
Using the cvat-sdk Python library, write a script to upload a folder of images to a CVAT project, export the annotations as COCO JSON, and save them locally.
Prompt 3
I need to convert my CVAT-exported annotations into YOLO format for training a detection model. Help me use Datumaro to transform the dataset.
Prompt 4
Set up auto-annotation in CVAT using a pre-trained model so the tool suggests bounding boxes that my team only needs to verify and correct.
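
For context on Prompt 3, the core of a COCO-to-YOLO conversion is a change of box coordinates: COCO stores a box as absolute [x, y, width, height] from the top-left corner, while YOLO label files store the box centre and size normalised by the image dimensions. The sketch below shows only that arithmetic with hypothetical function names. It is not Datumaro's implementation, and Datumaro handles the full dataset conversion (files, class maps, splits) for you.

```python
def coco_bbox_to_yolo(bbox, image_width, image_height):
    """Convert a COCO [x, y, w, h] box in pixels to YOLO (cx, cy, w, h) in 0..1."""
    x, y, w, h = bbox
    cx = (x + w / 2) / image_width   # normalised box centre, horizontal
    cy = (y + h / 2) / image_height  # normalised box centre, vertical
    return (cx, cy, w / image_width, h / image_height)


def yolo_label_line(class_index, bbox, image_width, image_height):
    """Format one line of a YOLO label file for a single box."""
    cx, cy, w, h = coco_bbox_to_yolo(bbox, image_width, image_height)
    return f"{class_index} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"
```

For a 400 by 200 pixel image, the COCO box [100, 50, 200, 100] is centred in the frame and covers half of each dimension, so `coco_bbox_to_yolo([100, 50, 200, 100], 400, 200)` returns `(0.5, 0.5, 0.5, 0.5)`.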

Verify against the repo before relying on details.