Fine-Grained Semantics-Aware Representation Enhancement for Self-Supervised Monocular Depth Estimation

Hyunyoung Jung, Eunhyeok Park, Sungjoo Yoo; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 12642-12652

Abstract


Self-supervised monocular depth estimation has been widely studied, owing to its practical importance and recent promising improvements. However, most works suffer from limited supervision of photometric consistency, especially in weak texture regions and at object boundaries. To overcome this weakness, we propose novel ideas to improve self-supervised monocular depth estimation by leveraging cross-domain information, especially scene semantics. We focus on incorporating implicit semantic knowledge into geometric representation enhancement and suggest two ideas: a metric learning approach that exploits the semantics-guided local geometry to optimize intermediate depth representations and a novel feature fusion module that judiciously utilizes cross-modality between two heterogeneous feature representations. We comprehensively evaluate our methods on the KITTI dataset and demonstrate that our method outperforms state-of-the-art methods. The source code is available at https://github.com/hyBlue/FSRE-Depth.

Related Material


@InProceedings{Jung_2021_ICCV,
    author    = {Jung, Hyunyoung and Park, Eunhyeok and Yoo, Sungjoo},
    title     = {Fine-Grained Semantics-Aware Representation Enhancement for Self-Supervised Monocular Depth Estimation},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {12642-12652}
}