@InProceedings{Jang_2025_WACV,
    author    = {Jang, Junbo and Park, Chanyeong and Kim, Heegwang and Lee, Jiyoon and Paik, Joonki},
    title     = {Multispectral Object Detection Enhanced by Cross-Modal Information Complementary and Cosine Similarity Channel Resampling Modules},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {9419-9428}
}
Multispectral Object Detection Enhanced by Cross-Modal Information Complementary and Cosine Similarity Channel Resampling Modules
Abstract
Images obtained from different modalities can effectively enhance the accuracy and reliability of the detection model by complementing specialized information from visible (RGB) and infrared (IR) images. However, integrating information from multiple modalities faces the following challenges: 1) the distinct characteristics of RGB and IR images lead to modality imbalance; 2) fusing multimodal information can greatly affect detection accuracy, as some of the unique information provided by each modality is lost during the integration process; and 3) RGB and IR images are fused while preserving the noise of each modality. To address these issues, we propose a novel multispectral object detection network that contains two main components: 1) a Cross-modal Information Complementary (CIC) module and 2) a Cosine Similarity Channel Resampling (CSCR) module. The proposed method addresses the modality imbalance problem and efficiently fuses RGB and IR images at the feature level. Extensive experimental results on the LLVIP, FLIR, M3FD, VEDAI, and KAIST benchmark datasets verify the effectiveness and generalization performance of the proposed multispectral object detection network compared with other state-of-the-art methods.
Related Material