Deep RGB-D Saliency Detection With Depth-Sensitive Attention and Automatic Multi-Modal Fusion

Peng Sun, Wenhu Zhang, Huanyu Wang, Songyuan Li, Xi Li; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 1407-1417

Abstract


RGB-D salient object detection (SOD) is usually formulated as a problem of classification or regression over two modalities, i.e., RGB and depth. Hence, effective RGB-D feature modeling and multi-modal feature fusion both play a vital role in RGB-D SOD. In this paper, we propose a depth-sensitive RGB feature modeling scheme using the depth-wise geometric prior of salient objects. In principle, the feature modeling scheme is carried out in a depth-sensitive attention module, which enhances the RGB features and reduces background distraction by capturing the depth geometry prior. Moreover, to perform effective multi-modal feature fusion, we further present an automatic architecture search approach for RGB-D SOD, which effectively discovers a feasible architecture from our specially designed multi-modal multi-scale search space. Extensive experiments on seven standard benchmarks demonstrate the effectiveness of the proposed approach against the state-of-the-art.
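As an illustration of the general idea behind depth-sensitive attention, the following is a minimal NumPy sketch (not the authors' exact module): a depth map is converted into a spatial attention map that up-weights likely-foreground regions and then modulates the RGB feature tensor. The closer-is-salient prior and all function names here are assumptions for illustration only.

```python
import numpy as np

def depth_sensitive_attention(rgb_feat, depth_map):
    """Hypothetical sketch: modulate C x H x W RGB features with a
    depth-derived spatial attention map (not the paper's exact DSAM)."""
    # Normalize depth to [0, 1] for a scale-free prior.
    d = (depth_map - depth_map.min()) / (depth_map.max() - depth_map.min() + 1e-8)
    # Toy geometric prior: assume salient objects tend to be closer
    # (smaller depth), so a sigmoid assigns near regions higher weight.
    attn = 1.0 / (1.0 + np.exp((d - d.mean()) * 8.0))  # values in (0, 1)
    # Broadcast the attention map over channels: enhance foreground
    # responses via a residual connection, leaving features intact
    # where attention is low.
    return rgb_feat * attn[None, :, :] + rgb_feat

rgb_feat = np.random.rand(64, 32, 32)   # C x H x W backbone features
depth_map = np.random.rand(32, 32)      # aligned depth map
out = depth_sensitive_attention(rgb_feat, depth_map)
```

In the paper the attention is learned rather than hand-crafted, but the structure is similar: a depth-driven spatial weighting applied multiplicatively to RGB features, with a residual path so that useful features are never fully suppressed.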

Related Material


@InProceedings{Sun_2021_CVPR,
    author    = {Sun, Peng and Zhang, Wenhu and Wang, Huanyu and Li, Songyuan and Li, Xi},
    title     = {Deep RGB-D Saliency Detection With Depth-Sensitive Attention and Automatic Multi-Modal Fusion},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {1407-1417}
}