@InProceedings{Hao_2026_CVPR,
    author    = {Hao, Xiaohui and Pu, Yanglin and Wang, Yongjun and She, Rui},
    title     = {Distribution-Aligned Multimodal Fusion for Robust Object Detection},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
    pages     = {25494-25503}
}
Distribution-Aligned Multimodal Fusion for Robust Object Detection
Abstract
Cross-degradation generalization remains a critical challenge for RGB-infrared multimodal object detection, especially when the training data covers only a limited set of degradation types. This paper presents a distribution alignment framework built on a key insight: fused features should be aligned to the pretrained distribution on which the frozen detector performs best, rather than adapted to training-specific degradations. By freezing the pretrained detector and training only a lightweight fusion module, our approach leverages complementary infrared information to reduce distribution shift while maintaining computational efficiency. The method achieves state-of-the-art results on three benchmarks with 4× faster training. Critically, we demonstrate that aligning to the pretrained distribution substantially outperforms aligning to training degradations when generalizing to unseen scenarios.