InfoBridge: Balanced Multimodal Integration through Conditional Dependency Modeling

Li, Chenxin; Liu, Yifan; Pan, Panwang; Liu, Hengyu; Liu, Xinyu; Li, Wuyang; Wang, Cheng; Yu, Weihao; Lin, Yiyang; Yuan, Yixuan

Chenxin Li, Yifan Liu, Panwang Pan, Hengyu Liu, Xinyu Liu, Wuyang Li, Cheng Wang, Weihao Yu, Yiyang Lin, Yixuan Yuan; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2025, pp. 393-404

Abstract

Developing systems that interpret diverse real-world signals remains a fundamental challenge in multimodal learning. Current approaches face significant obstacles from inherent modal heterogeneity. While existing methods attempt to enhance fusion through cross-modal alignment or interaction mechanisms, they often struggle to balance effective integration with the preservation of modality-specific information. We introduce InfoBridge, a novel framework grounded in conditional information maximization principles that addresses these limitations. Our approach reframes multimodal fusion through two key innovations: (i) we formulate fusion as conditional mutual information optimization with an integrated protective margin, which encourages cross-modal information sharing while safeguarding against over-fusion that would eliminate modal characteristics; and (ii) we enable fine-grained contextual fusion by leveraging modality-specific conditions to guide integration. Extensive evaluations across benchmarks demonstrate that InfoBridge consistently outperforms state-of-the-art multimodal architectures, establishing a principled approach that better captures complementary information across input signals. Project page: https://cuhk-aim-group.github.io/InfoBridge/.
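For readers unfamiliar with the quantity named in innovation (i), the following is a minimal sketch of conditional mutual information I(X; Y | Z) on a small discrete distribution, together with one possible reading of a "protective margin". The abstract does not state the actual objective, so the hinge form, the function names `conditional_mutual_information` and `margin_capped_loss`, and the `margin` parameter are all illustrative assumptions, not the authors' method.

```python
import math


def conditional_mutual_information(joint):
    """Return I(X; Y | Z) in nats for a discrete joint distribution.

    `joint` maps (x, y, z) tuples to probabilities that sum to 1.
    """
    p_z, p_xz, p_yz = {}, {}, {}
    # Accumulate the marginals p(z), p(x, z), and p(y, z).
    for (x, y, z), p in joint.items():
        p_z[z] = p_z.get(z, 0.0) + p
        p_xz[(x, z)] = p_xz.get((x, z), 0.0) + p
        p_yz[(y, z)] = p_yz.get((y, z), 0.0) + p

    cmi = 0.0
    for (x, y, z), p in joint.items():
        if p > 0.0:
            # Standard definition: sum of p(x,y,z) * log[p(z) p(x,y,z) / (p(x,z) p(y,z))].
            cmi += p * math.log(p * p_z[z] / (p_xz[(x, z)] * p_yz[(y, z)]))
    return cmi


def margin_capped_loss(cmi, margin):
    """Hypothetical hinge (an assumption, not from the paper):
    penalize only while the shared information is below `margin`,
    so there is no pressure to push sharing past that point."""
    return max(0.0, margin - cmi)


# Two toy cases with binary variables.
# Case 1: Y is an exact copy of X, and Z is an independent fair bit.
copied = {(b, b, z): 0.25 for b in (0, 1) for z in (0, 1)}
# Case 2: X, Y, and Z are three independent fair bits.
independent = {(x, y, z): 0.125 for x in (0, 1) for y in (0, 1) for z in (0, 1)}

cmi_copied = conditional_mutual_information(copied)
cmi_independent = conditional_mutual_information(independent)
```

In the first toy case the two "modalities" share one full bit given the condition, so the conditional mutual information equals log 2 nats; in the second they share nothing, so it is zero. Under the assumed hinge, a margin of 0.5 nats leaves the first case unpenalized and penalizes the second by the full margin.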

Related Material

[bibtex]

@InProceedings{Li_2025_ICCV,
    author    = {Li, Chenxin and Liu, Yifan and Pan, Panwang and Liu, Hengyu and Liu, Xinyu and Li, Wuyang and Wang, Cheng and Yu, Weihao and Lin, Yiyang and Yuan, Yixuan},
    title     = {InfoBridge: Balanced Multimodal Integration through Conditional Dependency Modeling},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2025},
    pages     = {393-404}
}