Multi-modal Characteristic Guided Depth Completion Network

Yongjin Lee, Seokjun Park, Beomgu Kang, HyunWook Park; Proceedings of the Asian Conference on Computer Vision (ACCV), 2022, pp. 4598-4612

Abstract


Depth completion techniques fuse a sparse depth map from LiDAR with a color image to generate an accurate dense depth map. Typically, multi-modal techniques exploit the complementary characteristics of each modality, overcoming the limited information available from a single modality. In depth completion in particular, LiDAR data provides relatively dense depth information for nearby objects but lacks information about distant objects and their boundaries. A color image, on the other hand, provides dense information even for distant objects, including their boundaries. The complementary characteristics of the two modalities are therefore well suited for fusion, and many depth completion studies have proposed fusion networks to address the sparsity of LiDAR data. However, previous fusion networks tend to simply concatenate the data from the two modalities and rely on a deep neural network to extract useful features, without considering the inherent characteristics of each modality. To enable effective modality-aware fusion, we propose a confidence guidance module (CGM) that estimates confidence maps emphasizing the salient regions for each modality. In our experiments, the confidence map for LiDAR data focused on near areas and object surfaces, while that for the color image focused on distant areas and object boundaries. We also propose a shallow feature fusion module (SFFM) to combine the two input modalities. Furthermore, we propose a parallel refinement stage for each modality to reduce computation time. Our results show that the proposed model achieves much faster computation time and competitive performance compared with the top-ranked models on the KITTI depth completion online leaderboard.
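The idea of per-modality confidence maps can be illustrated with a generic confidence-weighted blend of two dense depth predictions. The sketch below is not the authors' CGM, whose internals the abstract does not specify. It assumes each branch yields a depth value and a confidence logit per pixel, and it combines them with a two-way softmax. The function name `fuse_with_confidence` and all input values are hypothetical.

```python
import math


def fuse_with_confidence(depth_lidar, depth_color, logit_lidar, logit_color):
    """Blend two dense depth maps pixel by pixel using softmax confidence weights.

    All four arguments are equally sized 2-D lists of floats. The logits stand
    in for the output of a hypothetical confidence head (an assumption, not the
    paper's module).
    """
    fused = []
    for d_l_row, d_c_row, s_l_row, s_c_row in zip(
        depth_lidar, depth_color, logit_lidar, logit_color
    ):
        row = []
        for d_l, d_c, s_l, s_c in zip(d_l_row, d_c_row, s_l_row, s_c_row):
            m = max(s_l, s_c)  # subtract the max logit for numerical stability
            w_l = math.exp(s_l - m)
            w_c = math.exp(s_c - m)
            # Normalized weights sum to 1, so the result lies between d_l and d_c.
            row.append((w_l * d_l + w_c * d_c) / (w_l + w_c))
        fused.append(row)
    return fused


# Toy 1x2 example: a near pixel where both branches are trusted equally,
# and a far pixel where the color branch is trusted almost exclusively.
fused = fuse_with_confidence(
    depth_lidar=[[2.0, 40.0]],
    depth_color=[[4.0, 50.0]],
    logit_lidar=[[0.0, -50.0]],
    logit_color=[[0.0, 50.0]],
)
```

With equal logits the fused value is the plain average of the two predictions. With a large logit gap the fused value follows the more confident branch, which mirrors the near/far behavior the abstract reports for the learned confidence maps.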

Related Material


@InProceedings{Lee_2022_ACCV,
    author    = {Lee, Yongjin and Park, Seokjun and Kang, Beomgu and Park, HyunWook},
    title     = {Multi-modal Characteristic Guided Depth Completion Network},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {December},
    year      = {2022},
    pages     = {4598-4612}
}