Weakly Misalignment-free Adaptive Feature Alignment for UAVs-based Multimodal Object Detection

Chen Chen, Jiahao Qi, Xingyue Liu, Kangcheng Bin, Ruigang Fu, Xikun Hu, Ping Zhong; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 26836-26845

Abstract


Visible-infrared (RGB-IR) image fusion has shown great potentials in object detection based on unmanned aerial vehicles (UAVs). However the weakly misalignment problem between multimodal image pairs limits its performance in object detection. Most existing methods often ignore the modality gap and emphasize a strict alignment resulting in an upper bound of alignment quality and an increase of implementation costs. To address these challenges we propose a novel method named Offset-guided Adaptive Feature Alignment (OAFA) which could adaptively adjust the relative positions between multimodal features. Considering the impact of modality gap on the cross-modality spatial matching a Cross-modality Spatial Offset Modeling (CSOM) module is designed to establish a common subspace to estimate the precise feature-level offsets. Then an Offset-guided Deformable Alignment and Fusion (ODAF) module is utilized to implicitly capture optimal fusion positions for detection task rather than conducting a strict alignment. Comprehensive experiments demonstrate that our method not only achieves state-of-the-art performance in the UAVs-based object detection task but also shows strong robustness to the weakly misalignment problem.

Related Material


[pdf] [supp]
[bibtex]
@InProceedings{Chen_2024_CVPR, author = {Chen, Chen and Qi, Jiahao and Liu, Xingyue and Bin, Kangcheng and Fu, Ruigang and Hu, Xikun and Zhong, Ping}, title = {Weakly Misalignment-free Adaptive Feature Alignment for UAVs-based Multimodal Object Detection}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2024}, pages = {26836-26845} }