Read the arXiv paper to understand how a transformer architecture improves two-view bundle adjustment for 3D reconstruction pipelines.
Follow the repository for a future code release to experiment with camera pose refinement in your own structure-from-motion workflow.
No code is available in the repository yet; only the accompanying arXiv paper exists.
BA-T (Bundle Adjustment Transformer) is a research project from computer vision researchers at TU Munich, MCML, and ETH Zurich. The work is described as an iterative transformer-based method for two-view bundle adjustment.

Bundle adjustment is a refinement step in 3D reconstruction pipelines: given photos of the same scene taken from different positions, it jointly refines the estimated camera poses and the locations of the visible 3D points so that each point, projected back into each image, lands as close as possible to where it was observed. "Two-view" means the method works on pairs of images rather than larger sets. "Transformer" refers to an architecture that models relationships between many elements at once, originally developed for language processing and since applied widely in computer vision. BA-T applies that idea to bundle adjustment iteratively, refining its estimates over multiple passes.

The project accompanies an academic paper published as a preprint on arXiv. The authors are Ganlin Zhang, Weirong Chen, Daniel Cremers, and Xi Wang, all affiliated with TU Munich and the Munich Center for Machine Learning, with one author also at ETH Zurich.

At the time of writing, the repository contains no code. The README states only that code is coming soon, and it provides no installation instructions, usage examples, or architecture details beyond the title. Readers interested in the method should consult the arXiv paper directly.
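To make the bundle adjustment description above concrete, here is a minimal sketch of the classical two-view problem, not of BA-T itself (whose code and architecture are unreleased). It assumes a normalized pinhole camera, a second camera pose reduced to one yaw angle plus a translation, and synthetic data; all function names (`project`, `cost`, `refine`) are hypothetical. Plain gradient descent with backtracking stands in for the Gauss-Newton or Levenberg-Marquardt solvers that classical pipelines typically use.

```python
import math

def project(point, yaw, t):
    """Project a 3D point into a normalized pinhole camera (focal length 1)."""
    x, y, z = point
    c, s = math.cos(yaw), math.sin(yaw)
    # Rotate about the y axis, then translate into the camera frame.
    xc = c * x + s * z + t[0]
    yc = y + t[1]
    zc = -s * x + c * z + t[2]
    return (xc / zc, yc / zc)

def unpack(params):
    """Split the flat vector into camera-2 yaw, camera-2 translation, 3D points."""
    count = (len(params) - 4) // 3
    points = [params[4 + 3 * i: 7 + 3 * i] for i in range(count)]
    return params[0], params[1:4], points

def cost(params, obs1, obs2):
    """Sum of squared reprojection errors over both views."""
    yaw, t, points = unpack(params)
    total = 0.0
    for p, o1, o2 in zip(points, obs1, obs2):
        u1, v1 = project(p, 0.0, (0.0, 0.0, 0.0))  # view 1 is the fixed reference
        u2, v2 = project(p, yaw, t)
        total += (u1 - o1[0]) ** 2 + (v1 - o1[1]) ** 2
        total += (u2 - o2[0]) ** 2 + (v2 - o2[1]) ** 2
    return total

def numeric_gradient(f, params, eps=1e-6):
    """Central finite-difference gradient of f at params."""
    grad = []
    for i in range(len(params)):
        hi, lo = params[:], params[:]
        hi[i] += eps
        lo[i] -= eps
        grad.append((f(hi) - f(lo)) / (2 * eps))
    return grad

def refine(params, obs1, obs2, iterations=200):
    """Gradient descent with backtracking: accept only steps that lower the cost."""
    f = lambda p: cost(p, obs1, obs2)
    current = f(params)
    for _ in range(iterations):
        grad = numeric_gradient(f, params)
        step = 1.0
        while step > 1e-12:
            trial = [p - step * g for p, g in zip(params, grad)]
            trial_cost = f(trial)
            if trial_cost < current:
                params, current = trial, trial_cost
                break
            step *= 0.5
        else:
            break  # no improving step found; stop early
    return params, current

# Synthetic scene (assumed values): six points seen by two cameras.
true_points = [[-1.0, -0.5, 4.0], [1.0, -0.5, 5.0], [-0.5, 0.5, 6.0],
               [0.5, 1.0, 4.5], [0.0, 0.0, 5.5], [1.2, -1.0, 6.5]]
true_yaw, true_t = 0.1, [-1.0, 0.0, 0.0]
obs1 = [project(p, 0.0, (0.0, 0.0, 0.0)) for p in true_points]
obs2 = [project(p, true_yaw, true_t) for p in true_points]

# Perturbed starting estimate: wrong camera-2 pose and shifted points.
initial = [0.0, -0.8, 0.05, 0.1] + [c + 0.1 for p in true_points for c in p]
initial_cost = cost(initial, obs1, obs2)
refined, final_cost = refine(initial, obs1, obs2)
print("reprojection cost before:", initial_cost)
print("reprojection cost after: ", final_cost)
```

The sketch reports the reduction in reprojection error rather than comparing the result to the true pose, because a two-view reconstruction is only determined up to an overall scale.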
Verify against the repo before relying on details.