Build a security camera system that detects people and vehicles in real time.
Train a custom model to identify defects on a manufacturing assembly line.
Deploy object detection on a mobile app or embedded device using TFLite or CoreML.
Create an augmented reality app that recognizes and labels objects in the camera feed.
PyTorch installation and model download can take 10-15 minutes, depending on your internet speed and hardware.
YOLOv5 is a computer vision model that detects and identifies objects in images and video in real time. YOLO stands for You Only Look Once: the model processes an image in a single pass and produces all detections at once, rather than making multiple passes over it, which makes it fast enough for real-time video. For each image it reports the location and category of every object it finds, drawn as a labeled bounding box.

YOLOv5 was developed by Ultralytics and is written in Python on the PyTorch deep learning framework. It comes in several size variants that trade speed against accuracy, from the nano model (YOLOv5n) suited to embedded hardware up to the extra-large model (YOLOv5x) for maximum accuracy. Deployment can target CPUs, NVIDIA GPUs, Apple Silicon, and mobile chipsets.

The repository includes tools for training on your own dataset of labeled images, running inference on images or video streams, and exporting a trained model to deployment formats including ONNX (a portable model format), Apple CoreML for iOS apps, and TFLite for Android or embedded devices. Beyond object detection, it also supports instance segmentation and image classification. The project integrates with Ultralytics HUB for cloud training and model management. The README notes that a newer model, YOLO11, is now available with improved performance and encourages users to consider upgrading.

Use YOLOv5 when you need to identify and locate objects in images or video, for example in security cameras, manufacturing quality control, autonomous vehicles, or augmented reality applications.
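As a rough illustration of the inference workflow described above, the sketch below loads a pretrained model through PyTorch Hub and keeps only people and vehicles, as in the security camera use case. It assumes PyTorch and the YOLOv5 dependencies are installed and that the weights can be downloaded on first use. The names `WANTED`, `filter_detections`, and `detect`, and the 0.5 confidence threshold, are illustrative choices and not part of the repository.

```python
# Sketch only: helper names and the threshold are assumptions for illustration.

# COCO class names (used by the pretrained checkpoints) for people and vehicles.
WANTED = {"person", "bicycle", "car", "motorcycle", "bus", "truck"}


def filter_detections(detections, wanted=WANTED, min_conf=0.5):
    """Keep detections whose class is in `wanted` with confidence >= min_conf.

    Each detection is a dict with at least the keys "name" and "confidence",
    matching the column names of results.pandas().xyxy[0].
    """
    return [
        d for d in detections
        if d["name"] in wanted and d["confidence"] >= min_conf
    ]


def detect(image_path, variant="yolov5s"):
    """Run one image through a pretrained YOLOv5 model via PyTorch Hub."""
    import torch  # imported lazily so filter_detections works without PyTorch

    # Downloads the repository code and weights the first time it is called.
    model = torch.hub.load("ultralytics/yolov5", variant)
    results = model(image_path)
    # One row per box: xmin, ymin, xmax, ymax, confidence, class, name.
    return results.pandas().xyxy[0].to_dict("records")
```

A call such as `filter_detections(detect("frame.jpg"))` (with `frame.jpg` a placeholder path) would return the person and vehicle boxes for a single frame. Passing a different `variant`, for example `"yolov5n"`, trades accuracy for speed.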
Generated 2026-05-18 · Model: sonnet-4-6 · Verify against the repo before relying on details.