Semantic Multi-View Stereo: Jointly Estimating Objects and Voxels

Ali Osman Ulusoy, Michael J. Black, Andreas Geiger; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 2414-2423

Abstract


Dense 3D reconstruction from RGB images is a highly ill-posed problem due to occlusions, textureless or reflective surfaces, as well as other challenges. We propose object-level shape priors to address these ambiguities. Towards this goal, we formulate a probabilistic model that integrates multi-view image evidence with 3D shape information from multiple objects. Inference in this model yields a dense 3D reconstruction of the scene as well as the existence and precise 3D pose of the objects in it. Our approach is able to recover fine details not captured in the input shapes while defaulting to the input models in occluded regions where image evidence is weak. Due to its probabilistic nature, the approach is able to cope with the approximate geometry of the 3D models as well as input shapes that are not present in the scene. We evaluate the approach quantitatively on several challenging indoor and outdoor datasets.

Related Material


[bibtex]
@InProceedings{Ulusoy_2017_CVPR,
author = {Ulusoy, Ali Osman and Black, Michael J. and Geiger, Andreas},
title = {Semantic Multi-View Stereo: Jointly Estimating Objects and Voxels},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {July},
year = {2017}
}