Multi-scale Residual Interaction for RGB-D Salient Object Detection

Mingjun Hu, Xiaoqin Zhang, Li Zhao; Proceedings of the Asian Conference on Computer Vision (ACCV), 2022, pp. 2494-2509


RGB-D salient object detection (SOD) aims to detect the most visually prominent object in a scene. Existing RGB-D SOD methods face a common problem: how to effectively integrate the differing context information of the RGB image and the depth map. In this work, we propose the Siamese Residual Interactive Refinement Network (SiamRIR), an encoder-decoder network, to address this problem. Concretely, we adopt a Siamese network with shared parameters to encode the two modalities and fuse them during the decoding phase. We then design the Multi-scale Residual Interactive Refinement Block (RIRB), which contains a Residual Interactive Module (RIM) and a Residual Refinement Module (RRM). This block uses multiple types of cues to fuse and refine features: RIM performs interaction between the modalities to integrate their complementary regions in a residual manner, and RRM refines features during the fusion phase by incorporating spatial detail context in a multi-scale manner. Extensive experiments on five benchmarks demonstrate that our method outperforms state-of-the-art RGB-D SOD methods both quantitatively and qualitatively.
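
The data flow described in the abstract (shared-parameter encoding, residual cross-modal interaction, multi-scale residual refinement) can be illustrated with a toy sketch on 1-D feature vectors. This is not the authors' implementation: every function name, the sigmoid gating used for interaction, and the average-pooling scales are assumptions chosen only to make the three ideas concrete.

```python
import math

def encode(x, weights):
    """Shared encoder: the same weights are applied to every modality (Siamese)."""
    return [max(0.0, w * v) for w, v in zip(weights, x)]  # elementwise scale + ReLU

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def residual_interaction(f_rgb, f_depth):
    """Hypothetical stand-in for RIM: gate each modality by the other, add residually."""
    rgb_out = [r + r * sigmoid(d) for r, d in zip(f_rgb, f_depth)]
    depth_out = [d + d * sigmoid(r) for r, d in zip(f_rgb, f_depth)]
    return [a + b for a, b in zip(rgb_out, depth_out)]  # fuse the two streams

def avg_pool_upsample(x, k):
    """Average over non-overlapping windows of size k, then repeat to the input length."""
    out = []
    for i in range(0, len(x), k):
        window = x[i:i + k]
        out.extend([sum(window) / len(window)] * len(window))
    return out

def residual_refinement(fused, scales=(1, 2, 4)):
    """Hypothetical stand-in for RRM: add averaged multi-scale context to the input."""
    contexts = [avg_pool_upsample(fused, k) for k in scales]
    mean_ctx = [sum(c[i] for c in contexts) / len(scales) for i in range(len(fused))]
    return [f + m for f, m in zip(fused, mean_ctx)]

def siam_rir_toy(rgb, depth, weights):
    """Encode both modalities with shared weights, then interact and refine."""
    f_rgb = encode(rgb, weights)
    f_depth = encode(depth, weights)
    return residual_refinement(residual_interaction(f_rgb, f_depth))
```

Calling `siam_rir_toy([1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0])` passes both inputs through the same `encode` weights, which is the only property of the Siamese design the sketch is meant to capture. The real network operates on 2-D feature maps with learned convolutional layers.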

Related Material

@InProceedings{Hu_2022_ACCV,
    author    = {Hu, Mingjun and Zhang, Xiaoqin and Zhao, Li},
    title     = {Multi-scale Residual Interaction for RGB-D Salient Object Detection},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {December},
    year      = {2022},
    pages     = {2494-2509}
}