Distribution-Consistent Modal Recovering for Incomplete Multimodal Learning

Yuanzhi Wang, Zhen Cui, Yong Li; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 22025-22034

Abstract


Recovering missing modalities is a popular strategy in incomplete multimodal learning because it usually benefits downstream tasks. However, existing methods often estimate the missing modalities directly from the observed ones with deep neural networks, without accounting for the distribution gap between modalities; as a result, the distribution of the recovered data is inconsistent with that of the true data. To mitigate this issue, we propose a novel recovery paradigm, Distribution-Consistent Modal Recovering (DiCMoR), which transfers the distributions of the available modalities to the missing ones and thereby keeps the recovered data distribution-consistent. In particular, we design a class-specific, flow-based modality recovery method that transforms cross-modal distributions conditioned on the sample class; by virtue of the invertibility and exact density estimation of normalizing flows, it can predict a distribution-consistent space for the missing modality. Data generated from the predicted distribution are then integrated with the available modalities for classification. Experiments demonstrate that DiCMoR achieves superior performance and is more robust than existing state-of-the-art methods under various missing patterns. Visualizations show that the distribution gaps between the recovered and the true missing modalities are reduced.
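The recovery step described above lends itself to a small illustration. The following is a minimal PyTorch sketch of the core idea as stated in the abstract: one class-conditional normalizing flow per modality, where the available modality is mapped forward to a shared latent space and the missing modality is recovered by inverting its own flow. This is not the authors' released code; the coupling design, layer widths, and names such as AffineCoupling, Flow, flow_a, and flow_m are illustrative assumptions.

# Hypothetical sketch of class-conditional, flow-based cross-modal recovery
# in the spirit of DiCMoR (not the authors' implementation).
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """RealNVP-style affine coupling layer conditioned on a class embedding."""
    def __init__(self, dim, num_classes, hidden=128):
        super().__init__()
        self.half = dim // 2
        self.cls_emb = nn.Embedding(num_classes, hidden)
        self.net = nn.Sequential(
            nn.Linear(self.half + hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.half)),
        )

    def forward(self, x, y):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        h = self.net(torch.cat([x1, self.cls_emb(y)], dim=-1))
        log_s, t = h.chunk(2, dim=-1)
        log_s = torch.tanh(log_s)            # bound scales for stability
        z2 = x2 * log_s.exp() + t
        return torch.cat([x1, z2], dim=-1), log_s.sum(-1)

    def inverse(self, z, y):
        z1, z2 = z[:, :self.half], z[:, self.half:]
        h = self.net(torch.cat([z1, self.cls_emb(y)], dim=-1))
        log_s, t = h.chunk(2, dim=-1)
        log_s = torch.tanh(log_s)
        x2 = (z2 - t) * (-log_s).exp()
        return torch.cat([z1, x2], dim=-1)

class Flow(nn.Module):
    """Stack of couplings with feature flips; maps a modality toward N(0, I)."""
    def __init__(self, dim, num_classes, depth=4):
        super().__init__()
        self.layers = nn.ModuleList(
            AffineCoupling(dim, num_classes) for _ in range(depth))

    def forward(self, x, y):
        log_det = x.new_zeros(x.size(0))
        for layer in self.layers:
            x, ld = layer(x, y)
            log_det = log_det + ld
            x = x.flip(-1)                   # swap halves between couplings
        return x, log_det

    def inverse(self, z, y):
        for layer in reversed(self.layers):
            z = z.flip(-1)                   # flip is its own inverse
            z = layer.inverse(z, y)
        return z

# One flow per modality, both normalizing to the same class-conditional latent.
flow_a = Flow(dim=64, num_classes=3)         # available modality (e.g., text)
flow_m = Flow(dim=64, num_classes=3)         # missing modality (e.g., audio)

x_a = torch.randn(8, 64)                     # observed modality features
y = torch.randint(0, 3, (8,))                # sample classes
z, _ = flow_a(x_a, y)                        # forward: available -> latent
x_m_hat = flow_m.inverse(z, y)               # inverse: latent -> recovered

In such a sketch, each flow would be trained by maximum likelihood (using the log-determinant returned by forward) so that both modalities map to the same latent distribution; the recovered features x_m_hat would then be fused with the available modalities for the downstream classification task.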

Related Material


@InProceedings{Wang_2023_ICCV,
    author    = {Wang, Yuanzhi and Cui, Zhen and Li, Yong},
    title     = {Distribution-Consistent Modal Recovering for Incomplete Multimodal Learning},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {22025-22034}
}