Multi-View Depth Estimation by Fusing Single-View Depth Probability With Multi-View Geometry

Bae, Gwangbin; Budvytis, Ignas; Cipolla, Roberto

Gwangbin Bae, Ignas Budvytis, Roberto Cipolla; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 2842-2851

Abstract

Multi-view depth estimation methods typically require the computation of a multi-view cost-volume, which leads to huge memory consumption and slow inference. Furthermore, multi-view matching can fail for texture-less surfaces, reflective surfaces and moving objects. For such failure modes, single-view depth estimation methods are often more reliable. To this end, we propose MaGNet, a novel framework for fusing single-view depth probability with multi-view geometry, to improve the accuracy, robustness and efficiency of multi-view depth estimation. For each frame, MaGNet estimates a single-view depth probability distribution, parameterized as a pixel-wise Gaussian. The distribution estimated for the reference frame is then used to sample per-pixel depth candidates. Such probabilistic sampling enables the network to achieve higher accuracy while evaluating fewer depth candidates. We also propose depth consistency weighting for the multi-view matching score, to ensure that the multi-view depth is consistent with the single-view predictions. The proposed method achieves state-of-the-art performance on ScanNet, 7-Scenes and KITTI. Qualitative evaluation demonstrates that our method is more robust against challenging artifacts such as texture-less/reflective surfaces and moving objects. Our code and model weights are available at https://github.com/baegwangbin/MaGNet.
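The probabilistic sampling and depth consistency weighting described above can be illustrated with a minimal per-pixel sketch. It is not taken from the MaGNet repository. The function names, the `beta` search-range parameter, the equal-probability-mass spacing of candidates, and the Gaussian form of the consistency weight are all assumptions made for illustration, since the abstract specifies only that the single-view distribution is a pixel-wise Gaussian.

```python
from __future__ import annotations

import math
from statistics import NormalDist


def candidate_offsets(num_candidates: int, beta: float = 3.0) -> list[float]:
    """Offsets b_k, in units of sigma, for depth candidates around the mean.

    Assumed scheme: split the probability mass of a standard normal inside
    [-beta, beta] into equal-mass bins and take the midpoint of each bin's
    edges, so candidates are denser where the distribution is more confident.
    """
    std = NormalDist()
    lo, hi = std.cdf(-beta), std.cdf(beta)
    step = (hi - lo) / num_candidates
    edges = [std.inv_cdf(lo + k * step) for k in range(num_candidates + 1)]
    return [0.5 * (edges[k] + edges[k + 1]) for k in range(num_candidates)]


def sample_depth_candidates(
    mu: float, sigma: float, num_candidates: int = 5, beta: float = 3.0
) -> list[float]:
    """Per-pixel depth candidates d_k = mu + b_k * sigma (hypothetical helper)."""
    return [mu + b * sigma for b in candidate_offsets(num_candidates, beta)]


def consistency_weight(depth: float, mu_nb: float, sigma_nb: float) -> float:
    """Unnormalized Gaussian likelihood of a candidate depth under the
    single-view prediction (mu_nb, sigma_nb) of a neighboring view.

    Assumed form: a weight of this kind could down-weight matching scores
    for candidates that disagree with the single-view estimate.
    """
    z = (depth - mu_nb) / sigma_nb
    return math.exp(-0.5 * z * z)


if __name__ == "__main__":
    for d in sample_depth_candidates(mu=2.0, sigma=0.1, num_candidates=5):
        print(f"depth={d:.4f}  weight={consistency_weight(d, 2.0, 0.1):.4f}")
```

In this sketch a small `sigma` concentrates all candidates near the predicted mean, which is one way to see why a confident single-view prediction allows fewer candidates to be evaluated than a uniform sweep over the full depth range.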

Related Material

BibTeX

@InProceedings{Bae_2022_CVPR,
    author    = {Bae, Gwangbin and Budvytis, Ignas and Cipolla, Roberto},
    title     = {Multi-View Depth Estimation by Fusing Single-View Depth Probability With Multi-View Geometry},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {2842-2851}
}