explaingit

humansignal/label-studio

Analysis updated 2026-06-21

27,215 stars · TypeScript · Audience · data · Complexity · 3/5 · Setup · moderate

TLDR

Label Studio is an open-source web app where teams label images, text, audio, and video to create training data for machine learning models, then export annotations in formats model training code can read.

Mindmap

mindmap
  root((Label Studio))
    What it does
      Data labeling
      Annotation UI
      Export for training
    Data types
      Images
      Text
      Audio
      Video
      Time series
    How to run
      pip install
      Docker image
      Self-hosted
      Cloud hosted
    Audience
      Data teams
      ML engineers
      Annotators

What do people build with it?

USE CASE 1

Have a team draw bounding boxes around objects in images to create training data for an object detection model.

USE CASE 2

Step through thousands of support messages and tag each one with a sentiment category for a text classifier.

USE CASE 3

Label audio clips with transcriptions or categories to build a training set for a speech recognition model.

USE CASE 4

Export completed annotations in a standardized format to plug directly into your model training pipeline.
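
As a rough illustration of the bounding-box and export use cases above, the sketch below turns exported rectangle annotations into pixel coordinates. It assumes the export is Label Studio's JSON format, where each region stores `x`, `y`, `width`, and `height` as percentages of the image alongside `original_width` and `original_height`; the sample data and the `to_pixel_boxes` helper are hypothetical, so check the field names against a real export before relying on them.

```python
import json

# Hypothetical sample shaped like a Label Studio JSON export for a
# bounding-box project. Field names are an assumption to verify
# against your own export file.
SAMPLE_EXPORT = json.loads("""
[
  {
    "id": 1,
    "data": {"image": "cat.jpg"},
    "annotations": [
      {
        "result": [
          {
            "type": "rectanglelabels",
            "original_width": 800,
            "original_height": 600,
            "value": {
              "x": 25.0, "y": 50.0, "width": 10.0, "height": 20.0,
              "rectanglelabels": ["cat"]
            }
          }
        ]
      }
    ]
  }
]
""")


def to_pixel_boxes(tasks):
    """Flatten tasks into (image, label, x_min, y_min, x_max, y_max) rows."""
    rows = []
    for task in tasks:
        image = task["data"]["image"]
        for annotation in task.get("annotations", []):
            for region in annotation.get("result", []):
                # Skip anything that is not a rectangle region.
                if region.get("type") != "rectanglelabels":
                    continue
                value = region["value"]
                img_w = region["original_width"]
                img_h = region["original_height"]
                # Percent-of-image values are scaled to pixels.
                x_min = value["x"] * img_w / 100
                y_min = value["y"] * img_h / 100
                x_max = x_min + value["width"] * img_w / 100
                y_max = y_min + value["height"] * img_h / 100
                for label in value["rectanglelabels"]:
                    rows.append((image, label, x_min, y_min, x_max, y_max))
    return rows


boxes = to_pixel_boxes(SAMPLE_EXPORT)
for row in boxes:
    print(row)
```

Corner-style pixel boxes are only one possible target; a training pipeline that expects a different layout (for example centre-based or normalised coordinates) would need a further conversion step.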

What is it built with?

TypeScript · Python · Docker · PostgreSQL · Nginx

How does it compare?

                   humansignal/label-studio   invoke-ai/invokeai   recharts/recharts
Stars              27,215                     27,117               27,089
Language           TypeScript                 TypeScript           TypeScript
Setup difficulty   moderate                   moderate             easy
Complexity         3/5                        3/5                  2/5
Audience           data                       designer             developer

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · moderate
Time to first run · 30 min

A production setup requires Docker; a local pip install is the quickest route for solo use.

In plain English

Label Studio is an open-source tool for labeling data so it can be used to train machine learning models. In machine learning, models learn by example, and those examples need to be tagged or annotated by people first: drawing a box around the cat in a photo, marking which words in a sentence are names, choosing the right category for an audio clip, and so on. Label Studio gives a team a single web interface where they can do this kind of work on many different kinds of data and then export the result in a standardized format the model training code can read.

The README explains that it supports audio, text, images, video, and time-series data. You install it, point it at a dataset, pick a labeling template that matches the task, and your annotators see a simple browser-based UI to step through the data and tag each item. It can also be customized to fit a particular dataset rather than only the built-in templates. The interface is meant for preparing raw data for a new model or improving an existing training set so the resulting model becomes more accurate.

Someone would use it when they have a pile of unlabeled data and need humans (themselves, a team, or contractors) to attach the labels a model needs. It fits both small experiments running locally on one machine and larger production setups, and there is also a hosted edition offered by the maintainers as an alternative to self-hosting. It is distributed both as a Python package installable with pip, poetry, or Anaconda, and as a Docker image. A production deployment can run via Docker Compose alongside Nginx and PostgreSQL.
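
The "point it at a dataset" step described above often starts with an import file. The sketch below shows one way to wrap raw text items into a JSON task list before importing them. It assumes the JSON task shape in which each item sits under a `data` key, and that the key name (`text` here) matches the variable referenced by the project's labeling template; the `build_text_tasks` helper and the sample reviews are hypothetical.

```python
import json


def build_text_tasks(texts, key="text"):
    """Wrap each non-empty string in a {"data": {key: ...}} task dictionary."""
    return [{"data": {key: text}} for text in texts if text.strip()]


# Hypothetical raw items; the whitespace-only entry is dropped.
reviews = [
    "Great product, works as described.",
    "   ",
    "Stopped working after a week.",
]

tasks = build_text_tasks(reviews)

# Serialize to JSON text that could be saved to a file and imported.
payload = json.dumps(tasks, indent=2)
print(payload)
```

Filtering out blank items before import is a design choice in this sketch, so that annotators are not shown empty tasks; whether that is appropriate depends on the dataset.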

Copy-paste prompts

Prompt 1
Set up Label Studio locally with Docker Compose so my team can label images for an object detection model, and show me how to create the first project.
Prompt 2
I have 5,000 product review texts I need classified as positive, neutral, or negative. Walk me through setting up a Label Studio annotation project for this task.
Prompt 3
Write a Python script that exports completed annotations from Label Studio in COCO format ready for training a YOLOv8 model.
Prompt 4
How do I connect Label Studio to an S3 bucket so annotators can label images stored in the cloud without downloading them?
Prompt 5
Set up Label Studio with PostgreSQL as the database backend instead of SQLite for a production team annotation workflow.

Frequently asked questions

What is label-studio?

Label Studio is an open-source web app where teams label images, text, audio, and video to create training data for machine learning models, then export annotations in formats model training code can read.

What language is label-studio written in?

Mainly TypeScript. The stack also includes Python, Docker, PostgreSQL, and Nginx.

How hard is label-studio to set up?

Setup difficulty is rated moderate, with roughly 30 minutes to a first successful run.

Who is label-studio for?

Mainly data teams, including ML engineers and annotators who need to label datasets for machine learning.

Verify against the repo before relying on details.