Label thousands of product photos with bounding boxes to build a training dataset for a custom YOLO detection model.
Use the Segment Anything model to click on any object and automatically generate a pixel-perfect outline mask.
Annotate video frames for pose estimation or document text detection using built-in OCR models.
Export finished annotations in COCO, YOLO, or VOC format ready to feed into a training pipeline.
Requires downloading specific ONNX model files separately for each AI model you want to use.
X-AnyLabeling is a desktop tool for annotating images and videos with the help of AI models. Annotation means drawing labels on images to mark where objects are, what type they are, or what they look like. This kind of labeled data is what AI models need during training, and creating it by hand is slow work. X-AnyLabeling is designed to speed that up by letting AI models do an initial pass at labeling, which a human can then review and correct.

The tool supports a wide range of labeling types: bounding boxes around objects, pixel-level outlines (called segmentation masks), points for pose estimation, rotated boxes for objects at odd angles, text regions for document scanning, and more. It can work with both still images and video frames. Once you have finished labeling, it can export your annotations in formats used by popular training frameworks, including COCO, YOLO, VOC, and several others.

What makes it more than a basic drawing tool is the built-in model library. You can run AI models like YOLO variants, Segment Anything (a model from Meta that can outline any object you click on), Grounding DINO (which finds objects based on text descriptions you type), OCR tools for reading text in images, and vision-language models including Gemini and ChatGPT integrations. These models run locally using the ONNX Runtime or TensorRT backends, meaning you do not need to send your data to an external server unless you choose the remote inference option.

The interface supports English, Chinese, Japanese, and Korean. It runs on Windows, Linux, and macOS. You can add your own custom models if the built-in library does not cover your use case.

X-AnyLabeling is actively maintained and receives regular updates. Recent additions include support for SAM 3, TensorRT-accelerated YOLO inference, PaddleOCR document parsing, and 3D cuboid annotation. It is licensed under LGPL v3.
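The export formats mentioned above (COCO, YOLO, VOC) differ mainly in how they encode a bounding box. The sketch below illustrates those three conventions for a single box. It is not X-AnyLabeling's exporter: the `Box` class and the `to_voc`, `to_coco`, and `to_yolo` helpers are hypothetical names introduced here only for illustration. It assumes an axis-aligned box given in pixel coordinates.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned box in pixels: top-left and bottom-right corners (hypothetical helper)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def to_voc(box: Box) -> tuple:
    # Pascal VOC stores the two corners directly, in pixels.
    return (box.xmin, box.ymin, box.xmax, box.ymax)


def to_coco(box: Box) -> list:
    # COCO stores [x, y, width, height], where (x, y) is the top-left corner, in pixels.
    return [box.xmin, box.ymin, box.xmax - box.xmin, box.ymax - box.ymin]


def to_yolo(box: Box, img_w: int, img_h: int, class_id: int) -> str:
    # YOLO text labels store: class id, box centre x, centre y, width, height,
    # each normalised by the image size so the values fall in [0, 1].
    cx = (box.xmin + box.xmax) / 2 / img_w
    cy = (box.ymin + box.ymax) / 2 / img_h
    w = (box.xmax - box.xmin) / img_w
    h = (box.ymax - box.ymin) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


if __name__ == "__main__":
    box = Box(xmin=100, ymin=50, xmax=300, ymax=250)
    print(to_voc(box))
    print(to_coco(box))
    print(to_yolo(box, img_w=400, img_h=500, class_id=0))
```

Because YOLO coordinates are relative to the image size, converting to that format needs the image width and height, while the VOC and COCO box encodings stay in pixels.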
Verify against the repo before relying on details.