Towards Scene Understanding: Unsupervised Monocular Depth Estimation With Semantic-Aware Representation

Po-Yi Chen, Alexander H. Liu, Yen-Cheng Liu, Yu-Chiang Frank Wang; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 2624-2632

Abstract


Monocular depth estimation is a challenging task in scene understanding, with the goal of acquiring the geometric properties of 3D space from 2D images. Because RGB-depth image pairs are scarce, unsupervised learning methods aim to derive depth information from alternative supervision such as stereo pairs. However, most existing works fail to model the geometric structure of objects, which generally results from considering only pixel-level objective functions during training. In this paper, we propose SceneNet to overcome this limitation with the aid of semantic understanding from segmentation. Moreover, our proposed model is able to perform region-aware depth estimation by enforcing semantic consistency between stereo pairs. In our experiments, we qualitatively and quantitatively verify the effectiveness and robustness of our model, which performs favorably against state-of-the-art approaches.

Related Material


[bibtex]
@InProceedings{Chen_2019_CVPR,
author = {Chen, Po-Yi and Liu, Alexander H. and Liu, Yen-Cheng and Wang, Yu-Chiang Frank},
title = {Towards Scene Understanding: Unsupervised Monocular Depth Estimation With Semantic-Aware Representation},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2019}
}