Stitch multiple photos into a seamless panorama by finding matching points across overlapping images.
Figure out where a camera was positioned when a photo was taken by matching it against other known images.
Reconstruct a 3D model of a place or object from a collection of photos taken at different angles.
Build visual search or object recognition pipelines that need to find the same scene across different photos.
Install via pip and load pretrained weights from Hugging Face. Pick a feature detector (SuperPoint, DISK, ALIKED, or SIFT), run it on your images, then pass results to LightGlue. Training requires the separate glue-factory repo.
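The detect-then-match flow described above can be sketched as follows. This is a minimal sketch, assuming the `lightglue` package from the repository is installed and that the names shown in its README (`SuperPoint`, `LightGlue`, `load_image`, `rbd`) are unchanged. The wrapper `match_pair`, its parameters, and any image paths you pass in are hypothetical and only for illustration.

```python
def match_pair(path0, path1, device="cpu", max_keypoints=2048):
    """Extract SuperPoint features from two images and match them with LightGlue.

    Hypothetical helper: the imports are done lazily so that defining this
    function does not require the package to be installed.
    """
    from lightglue import LightGlue, SuperPoint
    from lightglue.utils import load_image, rbd

    # Feature detector and the matcher configured for that detector's descriptors.
    extractor = SuperPoint(max_num_keypoints=max_keypoints).eval().to(device)
    matcher = LightGlue(features="superpoint").eval().to(device)

    # Load both images as tensors on the chosen device.
    image0 = load_image(path0).to(device)
    image1 = load_image(path1).to(device)

    # Step 1: detect keypoints and descriptors in each image.
    feats0 = extractor.extract(image0)
    feats1 = extractor.extract(image1)

    # Step 2: match the two feature sets.
    matches01 = matcher({"image0": feats0, "image1": feats1})

    # Remove the batch dimension from every returned dictionary.
    feats0, feats1, matches01 = [rbd(x) for x in (feats0, feats1, matches01)]

    # Each row of `matches` holds one index into each image's keypoints.
    matches = matches01["matches"]
    points0 = feats0["keypoints"][matches[..., 0]]
    points1 = feats1["keypoints"][matches[..., 1]]
    return points0, points1
```

Calling `match_pair("a.jpg", "b.jpg")` with two overlapping photos would return two aligned tensors of pixel coordinates, one row per matched pair. Swapping in another detector should only require changing the extractor class and the `features` argument, but check the repository for the exact names.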
LightGlue is a deep neural network that figures out which points in one photo correspond to which points in another photo of the same scene. Given two images, it takes a set of detected keypoints and their descriptors from each image, runs them through an attention-based network, and returns matched pairs of points. This kind of matching is a building block for tasks like estimating camera position, stitching panoramas, or reconstructing 3D scenes from multiple photos.
What makes LightGlue stand out is its adaptive design. For easy image pairs with clear overlap and good lighting, the network stops early and returns fast. For harder pairs, it runs through more layers and does more work. This means it uses only as much compute as each specific pair actually needs, rather than applying full processing to everything regardless of difficulty.
The project ships pretrained weights for use with four feature detectors: SuperPoint, DISK, ALIKED, and SIFT. You choose the detector that fits your use case, run it on both images to extract features, then feed those features into LightGlue to get the matches. It also integrates with Hugging Face Transformers, so a basic setup requires just a pip install and a few lines of Python.
Speed can be tuned through configuration. Lowering the adaptive thresholds makes it faster with a small accuracy trade-off. Disabling adaptivity entirely maximizes accuracy. With PyTorch compilation on a modern GPU, it reaches around 150 frames per second at 1024 keypoints per image, and around 20 frames per second on CPU at 512 keypoints.
This repository contains only the inference code. Training and evaluation require the companion library called glue-factory, which is a separate project by the same team.
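The adaptive early exit can be illustrated with a toy model. This is a simplified sketch and not the repository's implementation: the function `layers_needed`, the `point_threshold` parameter, and the confidence values are all made up for illustration. It assumes each layer produces a per-point confidence and that the network stops once a large enough fraction of points is confident.

```python
def layers_needed(per_layer_confidences, point_threshold=0.9, depth_confidence=0.95):
    """Return how many layers a toy adaptive matcher would run.

    per_layer_confidences: one list of per-point confidences (0..1) per layer.
    point_threshold: a point counts as confident at or above this value (hypothetical).
    depth_confidence: stop once the confident fraction exceeds this value;
        a non-positive value disables early stopping in this sketch.
    """
    total_layers = len(per_layer_confidences)
    if depth_confidence <= 0:
        return total_layers  # adaptivity disabled: always run every layer

    for layer_index, confidences in enumerate(per_layer_confidences, start=1):
        confident = sum(1 for c in confidences if c >= point_threshold)
        if confident / len(confidences) > depth_confidence:
            return layer_index  # enough points are settled, stop early
    return total_layers


# Made-up confidences for a 3-layer network and 4 keypoints.
easy_pair = [[0.5, 0.6, 0.7, 0.8], [0.95, 0.97, 0.99, 0.92], [0.99, 0.99, 0.99, 0.99]]
hard_pair = [[0.2, 0.3, 0.1, 0.4], [0.92, 0.95, 0.4, 0.7], [0.93, 0.95, 0.7, 0.91]]

easy_layers = layers_needed(easy_pair)                           # stops after layer 2
hard_layers = layers_needed(hard_pair)                           # runs all 3 layers
hard_fast = layers_needed(hard_pair, depth_confidence=0.4)       # lower threshold: 2 layers
easy_full = layers_needed(easy_pair, depth_confidence=-1)        # disabled: all 3 layers
```

The toy shows the trade-off described above: a lower threshold stops sooner on the hard pair, and disabling the check always runs the full depth. The repository's README exposes comparable options named `depth_confidence` and `width_confidence` on the matcher; check the repo for their current defaults and exact behavior.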
Verify against the repo before relying on details.