We propose a novel prior for variational 3D reconstruction that favors symmetric solutions when dealing with noisy or incomplete data.
We detect symmetries from incomplete data while explicitly handling unexplored areas to allow for plausible scene completions.
The set of detected symmetries is then enforced on their respective support domain within a variational reconstruction framework.
This formulation also handles multiple symmetries sharing the same support.
The proposed approach is able to denoise and complete surface geometry and even hallucinate large scene parts.
We demonstrate in several experiments the benefit of harnessing symmetries when regularizing a surface.
In semantic 3D modeling, the goal is to find a dense geometric model from images and, at the same time, infer the semantic classes of the individual parts of the reconstructed model. A semantically annotated dense 3D model gives a much richer representation of the scene than geometry alone. For example, questions such as what the volume of a building is can be answered directly. This is difficult with a purely geometric model, which lacks the knowledge of which parts of the geometry belong to the building. Also, by solving dense 3D reconstruction and class segmentation jointly, prior knowledge can be included, such as the fact that the ground is usually a surface that is close to horizontal.
PlaneSweepLib (PSL) is a library that implements the plane sweeping stereo matching algorithm. It is written in C++/CUDA by Christian Häne. It contains an implementation for the pinhole camera model and for the unified projection camera model (fisheye cameras). The package comes with small test datasets and applications for both camera models and runs on Linux and Windows. It is released under the terms of the GPLv3 license.
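The core geometric step of plane sweeping can be sketched as follows. This is a generic illustration in plain Python, not PSL's actual C++/CUDA API: the function names (`plane_homography`, `warp_pixel`) and all conventions are assumptions made for this example. It assumes pinhole cameras with zero skew, a reference camera at the origin, a source camera with pose X_src = R X_ref + t, and a sweep plane n^T X = d expressed in the reference frame, which gives the homography H = K_src (R + t n^T / d) K_ref^{-1}.

```python
def matmul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def inv_intrinsics(K):
    """Closed-form inverse of a zero-skew intrinsic matrix."""
    fx, fy, cx, cy = K[0][0], K[1][1], K[0][2], K[1][2]
    return [[1.0 / fx, 0.0, -cx / fx],
            [0.0, 1.0 / fy, -cy / fy],
            [0.0, 0.0, 1.0]]


def plane_homography(K_ref, K_src, R, t, n, d):
    """Homography mapping reference pixels to source pixels for the
    plane n^T X = d (reference frame), with X_src = R X_ref + t."""
    M = [[R[i][j] + t[i] * n[j] / d for j in range(3)] for i in range(3)]
    return matmul3(matmul3(K_src, M), inv_intrinsics(K_ref))


def warp_pixel(H, u, v):
    """Apply a homography to a pixel and dehomogenize."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return x / w, y / w


# Example: identical cameras, source shifted sideways by 0.1 units,
# fronto-parallel plane at depth 2.
K = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [-0.1, 0.0, 0.0]
H = plane_homography(K, K, R, t, n=[0.0, 0.0, 1.0], d=2.0)
u_src, v_src = warp_pixel(H, 320.0, 240.0)
```

Sweeping d over a range of depths and comparing each warped source image against the reference image produces the matching costs from which a depth map is selected.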
Multiple view geometry is well understood for the case of ideal pinhole cameras, and many algorithms exist to estimate epipolar geometry, trifocal tensors or homographies. In this research we focus on the problem of multiple view relations between images with radial distortion. One important case arises in sequential approaches, where one registers an unknown image (potentially with radial distortion) to a set of previously calibrated images. Here, we introduce the single-sided radial fundamental matrix as well as algorithms for estimating and decomposing it.
We present an algorithm to detect changes in the geometry of an urban environment using some images observing its current state. The proposed method can be used to significantly optimize the process of updating the 3D model of a city changing over time, by restricting this process to only those areas where changes are detected.
The method also accounts for the challenges involved in a large-scale application of change detection, such as inaccuracies in the input geometry, errors in the geo-location data of the images, and the limited amount of information due to sparse imagery.
Dense Reconstruction from Symmetry
A system is presented that takes a single image as an input
(e.g. showing the interior of St. Peter's Basilica) and automatically
detects an arbitrarily oriented symmetry plane in 3D space. Given this
symmetry plane a second camera is hallucinated that serves as a virtual
second image for dense 3D reconstruction, where the point of view for
reconstruction can be chosen on the symmetry plane. This naturally creates
a symmetry in the matching costs for dense stereo. Alternatively, we
also show how to enforce the 3D symmetry in dense depth estimation for
the original image. The two representations are qualitatively compared
on several real-world images, which also validate our fully automatic approach
for dense single image reconstruction.
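The mirroring behind the virtual second view can be illustrated with a reflection across a plane. This is a minimal sketch, assuming the symmetry plane is given as n . X + d = 0; the function names are hypothetical and this is not the system's actual implementation. Reflecting the real camera center across the plane gives the center of a virtual camera that observes the mirrored scene.

```python
import math


def normalize_plane(n, d):
    """Scale the plane n . X + d = 0 so that n has unit length."""
    norm = math.sqrt(sum(c * c for c in n))
    return tuple(c / norm for c in n), d / norm


def reflect_point(p, n, d):
    """Mirror point p across the plane n . X + d = 0."""
    n, d = normalize_plane(n, d)
    signed_dist = sum(pc * nc for pc, nc in zip(p, n)) + d
    return tuple(pc - 2.0 * signed_dist * nc for pc, nc in zip(p, n))


# Example: symmetry plane x = 1, real camera center at (3, 2, 5).
plane_n, plane_d = (1.0, 0.0, 0.0), -1.0
camera_center = (3.0, 2.0, 5.0)
virtual_center = reflect_point(camera_center, plane_n, plane_d)
# Reflecting twice returns the original point.
round_trip = reflect_point(virtual_center, plane_n, plane_d)
```

Points on the plane are left unchanged by the reflection, which is why a viewpoint chosen on the symmetry plane sees the scene and its mirror image consistently.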
We propose a new approach for structure from motion, where symmetry
relations in the 3D structure are automatically recovered from
multiple images and then imposed within a new constrained bundle
adjustment formulation that incorporates robust priors on the expected
model shape. Our approach significantly reduces drift through
"structural" loop closures and improves the accuracy of
reconstructions in urban scenes. We also use the discovered symmetries
to estimate a natural coordinate system and complete the 3D model.
We present a system for 3D reconstruction of large-scale outdoor scenes based on monocular motion stereo. Ours is the first such system to run at interactive frame rates on a mobile device (Google Project Tango Tablet), thus allowing a user to reconstruct scenes "on the go" by simply walking around them. We utilize the device's GPU to compute depth maps using plane sweep stereo. We then fuse the depth maps into a global model of the environment represented as a truncated signed distance function in a spatially hashed voxel grid. We observe that in contrast to reconstructing objects in a small volume of interest, or using the near outlier-free data provided by depth sensors, one can rely less on free-space measurements for suppressing outliers in unbounded large-scale scenes. Consequently, we propose a set of simple filtering operations to remove unreliable depth estimates and experimentally demonstrate the benefit of strongly filtering depth maps. We extensively evaluate the system with real as well as synthetic datasets.
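The fusion step can be sketched with a standard weighted running-average update of a truncated signed distance function stored in a hash map keyed by integer voxel coordinates. This is a simplified illustration under stated assumptions, not the system's implementation: the class name `HashedTSDF`, its parameters, and the per-sample interface are hypothetical, and the depth-map filtering operations are not shown because the text does not specify them.

```python
import math


class HashedTSDF:
    """Sparse TSDF: only voxels that receive measurements are stored."""

    def __init__(self, voxel_size, truncation):
        self.voxel_size = voxel_size
        self.truncation = truncation
        self.voxels = {}  # (i, j, k) -> (tsdf value, accumulated weight)

    def key(self, point):
        """Integer voxel coordinates containing a 3D point."""
        return tuple(math.floor(c / self.voxel_size) for c in point)

    def integrate(self, point, signed_distance, weight=1.0):
        """Fuse one signed-distance sample (positive in front of the surface)."""
        if signed_distance < -self.truncation:
            return  # far behind the observed surface: no information
        d = max(-1.0, min(1.0, signed_distance / self.truncation))
        k = self.key(point)
        old_d, old_w = self.voxels.get(k, (0.0, 0.0))
        new_w = old_w + weight
        self.voxels[k] = ((old_d * old_w + d * weight) / new_w, new_w)


# Example: two conflicting observations of the same voxel average out,
# and a sample far behind the surface is ignored.
grid = HashedTSDF(voxel_size=0.1, truncation=0.5)
grid.integrate((0.05, 0.05, 0.05), 0.25)
grid.integrate((0.05, 0.05, 0.05), -0.25)
grid.integrate((0.05, 0.05, 0.05), -1.0)
```

Because the map allocates storage only for voxels near observed surfaces, memory grows with the reconstructed surface area rather than with the volume of the scene, which matters for unbounded outdoor environments.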
Recent advances in Structure-from-Motion not only enable the reconstruction of large-scale scenes, but are also able to detect ambiguous structures caused by repeating elements that might result in incorrect reconstructions. Yet, it is not always possible to fully reconstruct a scene. The images required to merge different sub-models might be missing or it might be impossible to acquire such images in the first place due to occlusions or the structure of the scene. The problem of aligning multiple reconstructions that do not have visual overlap is impossible to solve in general. An important variant of this problem is the case in which individual sides of a building can be reconstructed but not joined due to the missing visual overlap. In this paper, we present a combinatorial approach for solving this variant by automatically stitching multiple sides of a building together. Our approach exploits symmetries and semantic information to reason about the possible geometric relations between the individual models. We show that our approach is able to reconstruct complete building models where traditional SfM ends up with disconnected building sides.
Structure-from-Motion can achieve accurate reconstructions of urban scenes. However, reconstructing the inside and the outside of a building into a single model is very challenging due to the lack of visual overlap and the change of lighting conditions between the two scenes. We propose a solution to align disconnected indoor and outdoor models of the same building into a single 3D model. Our approach leverages semantic information, specifically window detections, in multiple scenes to obtain candidate matches from which an alignment hypothesis can be computed. To determine the best alignment, we propose a novel cost function that takes both the number of window matches and the intersection of the aligned models into account. We evaluate our solution on multiple datasets.
We propose a novel two-step method for estimating the intrinsic and extrinsic
calibration of any radially symmetric camera, including non-central systems.
The first step consists of estimating the camera pose, given a Structure from
Motion (SfM) model, up to the translation along the optical axis. As a second
step, we obtain the calibration by finding the translation of the camera center
using an ordering constraint. The method makes use of the 1D radial camera
model, which allows us to effectively handle any radially symmetric camera,
including non-central ones. Using this ordering constraint, we show that we
are able to calibrate several different (central and non-central) Wide Field of
View (WFOV) cameras, including fisheye, hyper-catadioptric and spherical
catadioptric cameras, as well as pinhole cameras, using a single image or
jointly solving for several views.
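The reason the first step recovers the pose only up to the translation along the optical axis can be seen from the constraint of the 1D radial camera model: an image point, measured relative to the center of distortion, must be parallel to the first two coordinates of the 3D point in the camera frame. The sketch below is a generic illustration of that constraint, not the proposed method; the function name `radial_residual` and its interface are assumptions made for this example.

```python
def radial_residual(R, t_xy, X, x_img):
    """2D cross product between the image point (relative to the center of
    distortion) and the first two camera-frame coordinates of X.
    Only the first two rows of R and the first two entries of t are used."""
    cam_x = sum(R[0][j] * X[j] for j in range(3)) + t_xy[0]
    cam_y = sum(R[1][j] * X[j] for j in range(3)) + t_xy[1]
    u, v = x_img
    return u * cam_y - v * cam_x


# Example: identity rotation, zero lateral translation, point X = (1, 2, 5).
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
X = (1.0, 2.0, 5.0)
pinhole = (100.0, 200.0)               # f * (X/Z, Y/Z) with f = 500
distorted = (0.7 * 100.0, 0.7 * 200.0)  # same point after radial scaling
res_pinhole = radial_residual(R, (0.0, 0.0), X, pinhole)
res_distorted = radial_residual(R, (0.0, 0.0), X, distorted)
res_wrong = radial_residual(R, (0.0, 0.0), X, (100.0, 150.0))
```

The residual does not depend on the focal length, the radial distortion, or the translation along the optical axis, so those quantities have to be determined in a separate step. The residual also vanishes for a point on the opposite side of the center, so a complete method additionally needs to check the direction.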
We present the first full Structure-from-Motion pipeline based on privacy-preserving line features.