Tri-Subspaces Disentanglement for Multimodal Sentiment Analysis

Meng, Chunlei; Luo, Jiabin; Yan, Zhenglin; Yu, Zhenyu; Fu, Rong; Gan, Zhongxue; Ouyang, Chun

Chunlei Meng, Jiabin Luo, Zhenglin Yan, Zhenyu Yu, Rong Fu, Zhongxue Gan, Chun Ouyang; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026, pp. 8791-8800

Abstract

Multimodal Sentiment Analysis (MSA) integrates language, visual, and acoustic modalities to infer human sentiment. Most existing methods focus either on globally shared representations or on modality-specific features, overlooking signals that are shared only by certain modality pairs. This omission limits the expressiveness and discriminative power of multimodal representations. To address it, we propose a Tri-Subspace Disentanglement (TSD) framework that explicitly factorizes features into three complementary subspaces: a common subspace capturing global consistency, submodally-shared subspaces modeling pairwise cross-modal synergies, and private subspaces preserving modality-specific cues. To keep these subspaces pure and independent, we introduce a decoupling supervisor together with structured regularization losses. We further design a Subspace-Aware Cross-Attention (SACA) fusion module that adaptively models and integrates information from the three subspaces to obtain richer and more robust representations. Experiments on CMU-MOSI and CMU-MOSEI demonstrate that TSD achieves state-of-the-art performance across all key metrics, reaching 0.691 MAE on CMU-MOSI and 54.6% Acc-7 on CMU-MOSEI under the unaligned setting, and also transfers well to multimodal intent recognition tasks. Ablation studies confirm that tri-subspace disentanglement and SACA jointly enhance the modeling of multi-granular cross-modal sentiment cues.
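The three-way factorization described in the abstract can be illustrated with a minimal sketch. The sketch is not the authors' implementation: the random linear projections stand in for learned encoders, the names `make_projections`, `factorize`, and `orthogonality_penalty` are hypothetical, and the orthogonality penalty is only an assumed example of a structured regularizer, since the abstract does not specify the actual losses or the decoupling supervisor.

```python
import numpy as np

rng = np.random.default_rng(0)
MODALITIES = ("language", "visual", "acoustic")
PAIRS = (("language", "visual"), ("language", "acoustic"), ("visual", "acoustic"))


def make_projections(d_in, d_sub):
    """Random linear maps standing in for learned encoders (hypothetical)."""
    def init():
        return rng.normal(size=(d_in, d_sub)) / np.sqrt(d_in)

    proj = {}
    for m in MODALITIES:
        proj[("common", m)] = init()
        proj[("private", m)] = init()
    for a, b in PAIRS:
        for m in (a, b):
            proj[("pair", a, b, m)] = init()
    return proj


def factorize(features, proj):
    """Split each modality's features into common, pairwise-shared, and private parts."""
    out = {"common": {}, "pair": {}, "private": {}}
    for m in MODALITIES:
        out["common"][m] = features[m] @ proj[("common", m)]
        out["private"][m] = features[m] @ proj[("private", m)]
    for a, b in PAIRS:
        # Each pairwise subspace receives a view from both of its modalities.
        out["pair"][(a, b)] = {
            m: features[m] @ proj[("pair", a, b, m)] for m in (a, b)
        }
    return out


def orthogonality_penalty(x, y):
    """Squared Frobenius norm of x^T y (assumed regularizer, not from the paper).

    It is zero when every feature column of x is orthogonal to every
    feature column of y across the batch dimension.
    """
    return float(np.sum((x.T @ y) ** 2))
```

In such a setup, a penalty of this kind could be applied between, for example, the common and private representations of the same modality to discourage them from encoding the same information.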

Related Material


[bibtex]

@InProceedings{Meng_2026_CVPR,
    author    = {Meng, Chunlei and Luo, Jiabin and Yan, Zhenglin and Yu, Zhenyu and Fu, Rong and Gan, Zhongxue and Ouyang, Chun},
    title     = {Tri-Subspaces Disentanglement for Multimodal Sentiment Analysis},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
    pages     = {8791-8800}
}