Enhancing Self-Supervised Monocular Depth Estimation via Piece-Wise Pose Estimation and Geometric Constraints

Shyam, Pranjay; Okon, Alexandre; Yoo, HyunJin

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops, 2024, pp. 231-241

Abstract

Existing single- and multi-frame monocular depth estimation (MDE) approaches lack depth consistency around object edges, and single-frame approaches produce scale-ambiguous depth, albeit at lower computational cost. We revisit the framework design to address these limitations and propose a joint approach that intertwines depth estimation and panoptic segmentation networks. We present an instance-aware, patch-based contrastive loss that enforces depth consistency within an object in feature space. This loss disentangles the embedding triplet and independently refines the anchor-positive and anchor-negative pairs, yielding coherent depth within objects. Leveraging the panoptic information, we propose masking small objects during photometric loss computation while extracting 6-DoF pose estimates for dynamic objects in a piece-wise manner, thus facilitating depth estimation in dynamic scenes. We demonstrate that this mechanism is suited to both single- and multi-frame MDE. In addition, to ensure scale fidelity in single-frame MDE, we capitalize on the inherent linear relationship between computed depth and ground truth that arises under self-supervised, photometric-loss-based training. To this end, we propose using a multi-frame depth estimation network as a teacher that injects geometric insight into the student MDE network via a global scaling factor, thus generating absolute depth. We further improve the teacher network architecture by introducing a multi-scale feature fusion mechanism that benefits scenarios with significant camera motion. We perform a comprehensive evaluation to validate the efficacy of the proposed mechanism and obtain state-of-the-art performance on the KITTI dataset.
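
The global scaling idea can be illustrated with a minimal sketch. The abstract does not say how the scaling factor is computed, so the median-ratio estimator below is an assumption chosen for illustration, and the names global_scale and rescale are hypothetical rather than taken from the paper. Depth maps are flattened to plain lists to keep the example self-contained.

```python
from statistics import median


def global_scale(student_depth, teacher_depth, eps=1e-6):
    """Estimate one scalar that maps student depth onto teacher depth.

    Assumption: the two depth maps differ by a single linear factor, so the
    median of per-pixel ratios is used as a robust estimate of that factor.
    """
    ratios = [
        t / s
        for s, t in zip(student_depth, teacher_depth)
        if s > eps and t > eps  # skip invalid or near-zero depths
    ]
    if not ratios:
        raise ValueError("no valid depth pairs")
    return median(ratios)


def rescale(student_depth, scale):
    """Apply the global factor to obtain metric (absolute) depth."""
    return [scale * s for s in student_depth]


# Toy data: scale-ambiguous student depth and teacher depth in metres.
student = [0.5, 1.0, 2.0, 4.0]
teacher = [2.5, 5.0, 10.0, 20.0]

scale = global_scale(student, teacher)
absolute = rescale(student, scale)
```

A single scalar is sufficient here only because the relationship is assumed to be linear with no offset; if the student and teacher predictions also differed by a shift, a two-parameter fit would be needed instead.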

Related Material

@InProceedings{Shyam_2024_WACV,
    author    = {Shyam, Pranjay and Okon, Alexandre and Yoo, HyunJin},
    title     = {Enhancing Self-Supervised Monocular Depth Estimation via Piece-Wise Pose Estimation and Geometric Constraints},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops},
    month     = {January},
    year      = {2024},
    pages     = {231-241}
}