Run a live webcam demo to track a chosen object in real time using a pretrained model.
Evaluate a Siamese tracker on standard benchmarks like OTB2015 or VOT2018 using built-in test scripts.
Use SiamMask to produce pixel-level segmentation masks around a tracked object, not just a bounding box.
Download pretrained models from the model zoo and test them on your own video files.
Requires a CUDA-compatible GPU, a PyTorch build that matches the installed CUDA version, and manual download of pretrained models from the model zoo.
PySOT is a Python toolkit from SenseTime's video intelligence research team for tracking a single object across video frames. Given a video and a starting bounding box around an object, a tracker follows that object through subsequent frames even when it moves, changes appearance, or becomes partially obscured. This is a research platform, meaning it is designed for academics and engineers experimenting with new tracking algorithms rather than as a finished product.
The toolkit implements several tracking algorithms from published research papers, all belonging to a family of approaches called Siamese networks. These methods work by learning to compare two image patches: a small crop of the target object from the first frame, and candidate regions from later frames. The algorithms included are SiamFC, SiamRPN, DaSiamRPN, SiamRPN++, and SiamMask. SiamMask is notable because it also produces a pixel-level segmentation mask of the tracked object, not just a bounding box.
The system is built on top of PyTorch and supports several neural network backbone architectures including variants of ResNet and MobileNetV2. Pretrained models for the included algorithms are available to download from a model zoo linked in the repository.
The toolkit includes command-line scripts for running a webcam demo, testing a tracker against a downloaded benchmark dataset, and evaluating the results. It supports evaluation on several standard tracking benchmarks including OTB2015, VOT2016, VOT2018, LaSOT, and UAV123. Training instructions are in a separate file.
The codebase has supported multiple published papers from SenseTime, including work presented at CVPR 2018, CVPR 2019, and ECCV 2018. It is released under the Apache 2.0 license.
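To make the patch-comparison idea behind Siamese trackers concrete, here is a minimal sketch of the matching step: a small template is slid over a larger search region, and the position with the highest similarity score is taken as the target location. This is a toy illustration, not PySOT code. It assumes plain dot-product similarity on hand-written 2D grids, whereas the real trackers compare learned feature maps produced by a neural network backbone. The function names `cross_correlate` and `best_match` are hypothetical and do not appear in the toolkit.

```python
def cross_correlate(template, search):
    """Slide `template` over `search` and return a 2D map of similarity scores.

    Both arguments are lists of lists of numbers (toy stand-ins for the
    learned feature maps a real Siamese tracker would use).
    """
    th, tw = len(template), len(template[0])
    sh, sw = len(search), len(search[0])
    scores = []
    for y in range(sh - th + 1):
        row = []
        for x in range(sw - tw + 1):
            # Dot product between the template and the window at (y, x).
            row.append(sum(
                template[i][j] * search[y + i][x + j]
                for i in range(th)
                for j in range(tw)
            ))
        scores.append(row)
    return scores


def best_match(scores):
    """Return the (row, col) of the highest score in the response map."""
    positions = (
        (y, x) for y in range(len(scores)) for x in range(len(scores[0]))
    )
    return max(positions, key=lambda p: scores[p[0]][p[1]])


# Toy data: the 2x2 template pattern is embedded in the search region
# with its top-left corner at row 1, column 2.
template = [[1, -1],
            [-1, 1]]
search = [[0, 0, 0, 0],
          [0, 0, 1, -1],
          [0, 0, -1, 1],
          [0, 0, 0, 0]]

response = cross_correlate(template, search)
print(best_match(response))  # prints (1, 2)
```

In a real tracker the peak of the response map is only the starting point: depending on the algorithm, further network heads refine it into a bounding box or, for SiamMask, a segmentation mask.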
Verify against the repo before relying on details.