A Unified Approach to Facial Affect Analysis: The MAE-Face Visual Representation
Facial affect analysis is essential for understanding human expressions and behaviors, and it encompasses action unit (AU) detection, expression (EXPR) recognition, and valence-arousal (VA) estimation. The CVPR 2023 Competition on Affective Behavior Analysis in-the-wild (ABAW) provides the large-scale, high-quality Aff-Wild2 dataset for recognizing these widely used emotion representations. In this paper, we employ MAE-Face as a unified approach to learning robust visual representations for facial affect analysis. We propose multiple techniques to improve its fine-tuning performance on various downstream tasks, including a two-pass pre-training process and a two-pass fine-tuning process. Our approach achieves strong results on multiple datasets, highlighting its versatility. Moreover, the proposed model serves as the core component of our final framework in the ABAW5 competition. Our submission ranks first in the AU and EXPR tracks and second in the VA track.