Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation

Sihan Liu, Yiwei Ma, Xiaoqing Zhang, Haowei Wang, Jiayi Ji, Xiaoshuai Sun, Rongrong Ji; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 26658-26668

Abstract


Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing. Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery leading to suboptimal segmentation results. To address these challenges we introduce the Rotated Multi-Scale Interaction Network (RMSIN) an innovative approach designed for the unique demands of RRSIS. RMSIN incorporates an Intra-scale Interaction Module (IIM) to effectively address the fine-grained detail required at multiple scales and a Cross-scale Interaction Module (CIM) for integrating these details coherently across the network. Furthermore RMSIN employs an Adaptive Rotated Convolution (ARC) to account for the diverse orientations of objects a novel contribution that significantly enhances segmentation accuracy. To assess the efficacy of RMSIN we have curated an expansive dataset comprising 17402 image-caption-mask triplets which is unparalleled in terms of scale and variety. This dataset not only presents the model with a wide range of spatial and rotational scenarios but also establishes a stringent benchmark for the RRSIS task ensuring a rigorous evaluation of performance. Experimental evaluations demonstrate the exceptional performance of RMSIN surpassing existing state-of-the-art models by a significant margin. Datasets and code are available at https://github.com/Lsan2401/RMSIN.

Related Material


[pdf] [arXiv]
[bibtex]
@InProceedings{Liu_2024_CVPR, author = {Liu, Sihan and Ma, Yiwei and Zhang, Xiaoqing and Wang, Haowei and Ji, Jiayi and Sun, Xiaoshuai and Ji, Rongrong}, title = {Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2024}, pages = {26658-26668} }