Structure-from-Motion
Structure-from-Motion (SfM) is a robust vision pipeline that estimates camera parameters and a sparse 3D point cloud from an unordered set of images.
Why it is Important
- A robust and efficient pipeline that requires only a set of images to build a 3D map.
- Widely used as a backend in 3D vision to estimate camera intrinsics and extrinsics, e.g. for photogrammetry or novel view synthesis (such as NeRF).
- The sparse point cloud enables efficient and highly accurate localization of new images in the map.
Key Components
- Feature Detection and Matching: Local features have to be detected and matched across images.
- Visual Localization: Estimating the camera pose of an image w.r.t. a sparse 3D map is at the core of incremental mapping pipelines.
- Bundle Adjustment: Camera poses and the 3D point cloud are jointly refined with a large non-linear optimization called Bundle Adjustment.
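Two of the steps above can be sketched on toy data. The example below is a minimal illustration and not the implementation of any particular SfM system: it matches descriptors with a nearest-neighbor ratio test and evaluates the reprojection error, the quantity Bundle Adjustment minimizes over camera poses and 3D points. The function names (`match_ratio_test`, `project`, `reprojection_error`), the ratio threshold of 0.8, and the simple pinhole model without lens distortion are all assumptions made for illustration.

```python
import math


def match_ratio_test(desc_a, desc_b, ratio=0.8):
    """Match descriptors of image A to image B (hypothetical helper).

    A match is kept only if the nearest neighbor is clearly closer
    than the second nearest one (ratio test).
    """
    matches = []
    for i, da in enumerate(desc_a):
        # Distances from descriptor i to every descriptor of the other image.
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) >= 2 and dists[0][0] < ratio * dists[1][0]:
            matches.append((i, dists[0][1]))
    return matches


def project(point, R, t, f, cx, cy):
    """Project a 3D world point with an assumed pinhole camera (no distortion)."""
    # Transform into the camera frame: X_cam = R * X + t.
    xc = [sum(R[r][c] * point[c] for c in range(3)) + t[r] for r in range(3)]
    # Perspective division followed by the intrinsics (focal length, principal point).
    return (f * xc[0] / xc[2] + cx, f * xc[1] / xc[2] + cy)


def reprojection_error(points3d, observations, R, t, f, cx, cy):
    """Sum of squared pixel residuals between projections and observations."""
    total = 0.0
    for X, (u, v) in zip(points3d, observations):
        pu, pv = project(X, R, t, f, cx, cy)
        total += (pu - u) ** 2 + (pv - v) ** 2
    return total


# Toy descriptors: 2D vectors stand in for real high-dimensional descriptors.
desc_a = [(0.0, 0.0), (1.0, 1.0)]
desc_b = [(0.9, 1.1), (0.1, 0.0), (5.0, 5.0)]
matches = match_ratio_test(desc_a, desc_b)

# Toy camera: identity rotation, zero translation, made-up intrinsics.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t_true = [0.0, 0.0, 0.0]
f, cx, cy = 100.0, 50.0, 50.0
points3d = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
observations = [project(X, R, t_true, f, cx, cy) for X in points3d]

# The error vanishes at the true pose and grows when the pose is perturbed.
error_true = reprojection_error(points3d, observations, R, t_true, f, cx, cy)
error_off = reprojection_error(points3d, observations, R, [0.1, 0.0, 0.0], f, cx, cy)
print(matches, error_true, error_off)
```

In a full pipeline this cost would be handed to a non-linear least-squares solver that updates all camera poses and 3D points jointly; the sketch only evaluates it for a single camera.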
Applications
Structure-from-Motion is widely used in many computer vision tasks, such as:
- Photogrammetry
- 3D reconstruction
- Visual localization
Conclusion
Structure-from-Motion is the state-of-the-art approach for accurately estimating camera parameters from an image collection and is widely used as a backend in computer vision systems.
Publications
- LaMAR: Benchmarking Localization and Mapping for Augmented Reality (ECCV 2022) [Project page]
- Camera Pose Estimation using Implicit Distortion Models (CVPR 2022) [Paper]
- Pixel-Perfect Structure-from-Motion with Featuremetric Refinement (ICCV 2021) [Project page]
- Back to the Feature: Learning Robust Camera Localization from Pixels to Pose (CVPR 2021) [Project page]