Analysis updated 2026-05-18
Generate a labeled YOLO-format dataset for a custom object category without manually annotating any images.
Score the quality of an existing image dataset and predict the detection accuracy it will produce before training.
Run an active learning loop to automatically improve a dataset that scored below your quality threshold.
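The third use case can be pictured as a simple control loop: score the dataset, and while the score is below the threshold, collect more examples and score again. The sketch below is only an illustration under stated assumptions. `improve_until_threshold`, `toy_score`, `toy_collect`, and the `max_rounds` cap are hypothetical names introduced here and are not taken from the repo; the real scorer is the project's Neural DQS model and the real collector is its image pipeline.

```python
from typing import Callable, List, Tuple


def improve_until_threshold(
    dataset: List[str],
    score: Callable[[List[str]], float],
    collect_more: Callable[[List[str]], List[str]],
    threshold: float,
    max_rounds: int = 5,
) -> Tuple[List[str], float, int]:
    """Grow the dataset until score(dataset) >= threshold or max_rounds is hit.

    All names here are hypothetical; this is not the repo's API.
    """
    current = list(dataset)
    quality = score(current)
    rounds = 0
    while quality < threshold and rounds < max_rounds:
        # Add newly collected (and, in the real system, re-annotated) examples.
        current = current + collect_more(current)
        quality = score(current)
        rounds += 1
    return current, quality, rounds


def toy_score(items: List[str]) -> float:
    # Stand-in for a quality score: more distinct images means higher quality.
    return min(1.0, len(set(items)) / 10)


def toy_collect(items: List[str]) -> List[str]:
    # Stand-in for a collector: returns two new, unseen file names.
    n = len(items)
    return [f"img_{n}.jpg", f"img_{n + 1}.jpg"]
```

Calling `improve_until_threshold` with four starting images, the toy helpers, and a threshold of 0.8 runs two rounds and stops at eight images. The round cap is a design choice in this sketch so that a dataset that never reaches the threshold does not loop forever.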
| | ericchen931209/auto-dataset-builder | 0marildo/imago | agentlexi/agent-lexi |
|---|---|---|---|
| Stars | 3 | 3 | 3 |
| Language | Python | Python | Python |
| Setup difficulty | moderate | easy | moderate |
| Complexity | 4/5 | 2/5 | 4/5 |
| Audience | researcher | general | vibe coder |
Figures from each repo's GitHub metadata at analysis time.
Requires Docker Desktop; the first run pulls images and takes about 3 minutes. A Google Search API key is optional.
Auto Dataset Builder (ADB) is a platform that creates labeled training datasets for computer vision models, starting from nothing but a plain English description. You type something like "Build a Taiwan motorcycle detection dataset" and the system handles everything from finding images to annotating them and checking their quality.
The pipeline has several stages that run automatically. First it collects images and video frames from YouTube (Creative Commons licensed clips only) and optionally from Google Image Search. It then extracts useful frames, runs them through three annotation steps in sequence, cleans out blurry or poorly lit images, and exports a finished dataset in the format used by YOLO object detection models.
The annotation process chains three tools together. YOLOv11, an object detection model, proposes initial bounding boxes around objects. SAM2, a segmentation model, refines those boxes into more accurate boundaries. A vision language model then verifies each annotation, acting as a final check on accuracy.
The project's main research contribution is a quality metric called Neural DQS (Dataset Quality Score). It takes six measurements of a dataset, including how diverse the images are, how sharp they are, how well-balanced the object classes are, and how accurate the annotations are, then feeds those numbers into a small trained model to predict the mean average precision (mAP) the trained detector will achieve before you even start training. According to the README, this score correlated at 0.929 with actual mAP across 96 test datasets. If the quality score falls below a threshold, an active learning loop runs to collect more varied examples and re-annotate them until quality improves.
The system includes a web dashboard built with Vue 3 and a FastAPI backend. Setup is Docker-based: clone the repo, run docker compose up, and a browser dashboard opens at localhost:3000. No GPU is required. Optionally add a Google Search API key for image collection.
The project is MIT licensed and is accompanied by a published research paper on Zenodo.
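For readers unfamiliar with the export target: YOLO detection labels are plain text, one line per object, written as `class x_center y_center width height` with coordinates normalized to the 0 to 1 range by the image size. The sketch below shows that conversion from a pixel-space box. The function name `to_yolo_line` and the six-decimal formatting are assumptions made for illustration; the source does not describe how ADB's own exporter is written.

```python
from typing import Tuple


def to_yolo_line(
    class_id: int,
    box: Tuple[float, float, float, float],
    image_size: Tuple[int, int],
) -> str:
    """Convert a pixel box (x_min, y_min, x_max, y_max) to one YOLO label line.

    Hypothetical helper for illustration; not taken from the ADB codebase.
    """
    x_min, y_min, x_max, y_max = box
    width, height = image_size
    if not (0 <= x_min < x_max <= width and 0 <= y_min < y_max <= height):
        raise ValueError("box must lie inside the image and have positive area")
    # YOLO stores the box center and size, each divided by the image dimension.
    x_center = (x_min + x_max) / 2 / width
    y_center = (y_min + y_max) / 2 / height
    box_w = (x_max - x_min) / width
    box_h = (y_max - y_min) / height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {box_w:.6f} {box_h:.6f}"
```

For a 640x480 image, `to_yolo_line(0, (100, 200, 300, 400), (640, 480))` returns `0 0.312500 0.625000 0.312500 0.416667`.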
A platform that builds labeled computer vision training datasets automatically from a plain-English description, using chained AI tools for annotation and quality scoring.
Mainly Python. The stack also includes FastAPI and Vue 3.
MIT license: use freely for any purpose, including commercial, as long as you keep the copyright notice.
Setup difficulty is rated moderate, with roughly 30 minutes to a first successful run.
Mainly researchers.
Verify against the repo before relying on details.