DeCoTR: Enhancing Depth Completion with 2D and 3D Attentions

Yunxiao Shi, Manish Kumar Singh, Hong Cai, Fatih Porikli; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 10736-10746

Abstract


In this paper, we introduce a novel approach that harnesses both 2D and 3D attentions to enable highly accurate depth completion without requiring iterative spatial propagations. Specifically, we first enhance a baseline convolutional depth completion model by applying attention to 2D features in the bottleneck and skip connections. This effectively improves the performance of this simple network and puts it on par with the latest complex transformer-based models. Leveraging the initial depths and features from this network, we uplift the 2D features to form a 3D point cloud and construct a 3D point transformer to process it, allowing the model to explicitly learn and exploit 3D geometric features. In addition, we propose normalization techniques to process the point cloud, which improve learning and lead to better accuracy than directly using point transformers off the shelf. Furthermore, we incorporate global attention on downsampled point cloud features, which enables long-range context while remaining computationally feasible. We evaluate our method, DeCoTR, on established depth completion benchmarks, including NYU Depth V2 and KITTI, showing that it sets new state-of-the-art performance. We further conduct zero-shot evaluations on the ScanNet and DDAD benchmarks and demonstrate that DeCoTR has superior generalizability compared to existing approaches.
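The uplifting step described in the abstract can be illustrated with a minimal sketch, assuming a standard pinhole camera model. The function names `unproject_depth` and `normalize_points`, the toy intrinsics matrix `K`, and the unit-sphere normalization are illustrative assumptions; the abstract does not specify the normalization scheme DeCoTR actually uses, so this is not the authors' implementation.

```python
import numpy as np


def unproject_depth(depth, K):
    """Lift an (H, W) depth map to an (H*W, 3) point cloud.

    Assumes a pinhole camera with a 3x3 intrinsics matrix K
    (hypothetical helper, not taken from the paper).
    """
    h, w = depth.shape
    # v indexes rows (image y), u indexes columns (image x).
    v, u = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)


def normalize_points(points, eps=1e-8):
    """Center the cloud and scale it to fit the unit sphere.

    This is one common point-cloud normalization, used here only as
    a stand-in for the paper's unspecified normalization techniques.
    """
    centered = points - points.mean(axis=0, keepdims=True)
    scale = np.linalg.norm(centered, axis=1).max()
    return centered / (scale + eps)


# Toy example: a flat 4x6 depth map at 2 m with made-up intrinsics.
depth = np.full((4, 6), 2.0)
K = np.array([[500.0, 0.0, 3.0],
              [0.0, 500.0, 2.0],
              [0.0, 0.0, 1.0]])
points = unproject_depth(depth, K)
normalized = normalize_points(points)
```

In a full pipeline, each 3D point would also carry the 2D feature vector of its source pixel, so that a point transformer can attend over geometry and appearance together.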

Related Material


@InProceedings{Shi_2024_CVPR,
    author    = {Shi, Yunxiao and Singh, Manish Kumar and Cai, Hong and Porikli, Fatih},
    title     = {DeCoTR: Enhancing Depth Completion with 2D and 3D Attentions},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {10736-10746}
}