Efficient Feature Extraction and Late Fusion Strategy for Audiovisual Emotional Mimicry Intensity Estimation

Yu, Jun; Zhu, Wangyuan; Zhu, Jichao; Cai, Zhongpeng; Zhao, Gongpeng; Zhang, Zerui; Xie, Guochen; Wei, Zhihong; Liu, Qingsong; Liang, Jiaen

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 4866-4872

Abstract

In this paper, we present our solution to the Emotional Mimicry Intensity (EMI) Estimation challenge, which is part of the 6th Affective Behavior Analysis in-the-wild (ABAW) 2024. The EMI Estimation challenge aims to estimate the emotional intensity of seed videos across a set of predefined emotion categories (i.e., "Admiration", "Amusement", "Determination", "Empathic Pain", "Excitement" and "Joy"). To tackle this challenge, we extracted rich dual-channel visual features based on ResNet18 and facial Action Units (AUs) for the video modality, and effective single-channel features based on Wav2Vec2.0 for the audio modality. This allowed us to obtain comprehensive emotional features for the audiovisual modality. Additionally, using a late fusion strategy, we averaged the predictions of the visual and acoustic models, resulting in a more accurate estimation of audiovisual emotional mimicry intensity. Experimental results confirmed the effectiveness of our approach, with an average Pearson's Correlation Coefficient (r) of 0.3288 across the 6 emotional dimensions on the validation set and 0.3594 on the test set. Ultimately, we achieved third place in the competition.
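The late fusion step and the evaluation metric described in the abstract can be illustrated with a minimal sketch. It assumes each model outputs an array of shape (n_samples, 6), one column per emotion, and that fusion is an equal-weight average. The function names `late_fusion` and `mean_pearson` are hypothetical and are not taken from the authors' code.

```python
import numpy as np

# The six emotion categories named in the abstract (column order is an assumption).
EMOTIONS = ["Admiration", "Amusement", "Determination",
            "Empathic Pain", "Excitement", "Joy"]


def late_fusion(visual_pred, audio_pred):
    """Equal-weight average of visual and acoustic model predictions."""
    visual_pred = np.asarray(visual_pred, dtype=float)
    audio_pred = np.asarray(audio_pred, dtype=float)
    if visual_pred.shape != audio_pred.shape:
        raise ValueError("prediction arrays must have the same shape")
    return (visual_pred + audio_pred) / 2.0


def mean_pearson(pred, target):
    """Pearson's r per column (emotion), averaged over all columns."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    # Center each column, then compute covariance over the product of norms.
    pred_c = pred - pred.mean(axis=0)
    target_c = target - target.mean(axis=0)
    num = (pred_c * target_c).sum(axis=0)
    den = np.sqrt((pred_c ** 2).sum(axis=0) * (target_c ** 2).sum(axis=0))
    return float(np.mean(num / den))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.random((100, len(EMOTIONS)))
    # Synthetic stand-ins for the two unimodal models' outputs.
    visual = target + 0.5 * rng.standard_normal(target.shape)
    audio = target + 0.5 * rng.standard_normal(target.shape)
    fused = late_fusion(visual, audio)
    print("visual:", mean_pearson(visual, target))
    print("audio: ", mean_pearson(audio, target))
    print("fused: ", mean_pearson(fused, target))
```

The data in the demo is synthetic and only shows how the pieces fit together; it does not reproduce the reported scores.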

Related Material


@InProceedings{Yu_2024_CVPR,
    author    = {Yu, Jun and Zhu, Wangyuan and Zhu, Jichao and Cai, Zhongpeng and Zhao, Gongpeng and Zhang, Zerui and Xie, Guochen and Wei, Zhihong and Liu, Qingsong and Liang, Jiaen},
    title     = {Efficient Feature Extraction and Late Fusion Strategy for Audiovisual Emotional Mimicry Intensity Estimation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {4866-4872}
}