Decoupling Identity Confounders for Enhanced Facial Expression Recognition: An Information-Theoretic Approach

Mohd Aquib, Nishchal K. Verma, M. Jaleel Akhtar; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2025, pp. 5561-5570

Abstract


Facial expression recognition (FER) remains challenging due to subtle inter-class variations and significant intra-class differences, often exacerbated by identity-specific features that confound the expression features. While recent methods attempt to disentangle identity from expression, they often rely on auxiliary labels or computationally expensive image generation, which limits scalability. To address this, we propose DICE-FER (Decoupling Identity Confounders for Enhanced FER), a novel framework that decouples identity confounders from expression features through mutual information (MI) estimation, without requiring identity labels or image reconstruction. DICE-FER processes paired images that share an expression and partitions their features into (1) expression representations, whose cross-referenced MI across the pair is maximized, and (2) identity representations, whose MI with the expression representations is adversarially minimized. This dual optimization isolates identity-invariant expression cues while eliminating the need for costly generation or subject annotation. Experiments on benchmark datasets demonstrate that DICE-FER outperforms state-of-the-art methods in both disentanglement quality and recognition accuracy.
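The dual objective described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state which MI estimator or network is used, so the code below assumes an InfoNCE lower bound with a plain dot-product critic, and it replaces the adversarial minimization with a direct penalty term. The names `split_features`, `infonce_bound`, `dice_style_loss`, and the parameters `k` and `lam` are hypothetical and chosen only for illustration.

```python
import math


def split_features(feat, k):
    """Partition a feature vector: first k dims = expression, rest = identity."""
    return feat[:k], feat[k:]


def dot(u, v):
    # The dot-product critic is an assumption; it needs equal-sized vectors.
    if len(u) != len(v):
        raise ValueError("dot-product critic needs vectors of equal length")
    return sum(a * b for a, b in zip(u, v))


def infonce_bound(xs, ys):
    """InfoNCE lower bound on MI between paired samples xs[i] <-> ys[i].

    The value can never exceed log(len(xs)).
    """
    n = len(xs)
    total = 0.0
    for i in range(n):
        scores = [dot(xs[i], ys[j]) for j in range(n)]
        m = max(scores)  # subtract the max for numerical stability
        log_mean_exp = m + math.log(sum(math.exp(s - m) for s in scores) / n)
        total += scores[i] - log_mean_exp
    return total / n


def dice_style_loss(feats_a, feats_b, k, lam=1.0):
    """Loss for a batch of image pairs (a[i], b[i]) that share an expression.

    Rewards MI between the two expression parts of each pair and penalizes
    MI between the expression and identity parts of the same image.
    The penalty stands in for the adversarial term (a simplification).
    """
    expr_a = [split_features(f, k)[0] for f in feats_a]
    iden_a = [split_features(f, k)[1] for f in feats_a]
    expr_b = [split_features(f, k)[0] for f in feats_b]
    iden_b = [split_features(f, k)[1] for f in feats_b]
    mi_expr = infonce_bound(expr_a, expr_b)
    mi_leak = 0.5 * (infonce_bound(expr_a, iden_a) + infonce_bound(expr_b, iden_b))
    return -mi_expr + lam * mi_leak


# Toy batch of two pairs with 4-dim features, split 2 + 2 (made-up numbers).
feats_a = [[10.0, 0.0, 0.0, 0.0], [0.0, 10.0, 0.0, 0.0]]
feats_b = [[10.0, 0.0, 0.0, 0.0], [0.0, 10.0, 0.0, 0.0]]
loss = dice_style_loss(feats_a, feats_b, k=2)
```

In this toy batch the expression parts of each pair agree perfectly and the identity parts carry no information, so the loss approaches its minimum of -log(2) for a batch of two. A learned critic network would remove the equal-length restriction that the dot-product critic imposes on the two partitions.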

Related Material


@InProceedings{Aquib_2025_CVPR,
    author    = {Aquib, Mohd and Verma, Nishchal K. and Akhtar, M. Jaleel},
    title     = {Decoupling Identity Confounders for Enhanced Facial Expression Recognition: An Information-Theoretic Approach},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2025},
    pages     = {5561-5570}
}