Facial Expression Recognition Based on Multi-Modal Features for Videos in the Wild
This paper presents our submission to the Expression Classification Challenge of the 5th Affective Behavior Analysis in-the-wild (ABAW) Competition. In our method, multimodal features are extracted by several different pre-trained models and combined in different configurations to capture more effective emotion information. Specifically, we extract facial expression features using a Masked Autoencoder (MAE) encoder pre-trained on a large-scale face dataset. For these combinations of visual and audio features, we utilize two kinds of temporal encoders to explore the temporal contextual information in the data. In addition, we employ several ensemble strategies across different experimental settings to obtain the most accurate expression recognition results. Our system achieves an average F1 score of 0.4072 on the Aff-Wild2 test set, ranking 2nd, which demonstrates the effectiveness of our method.
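As a rough illustration of the fusion-plus-temporal-encoding pipeline the abstract describes, the sketch below combines per-frame visual and audio features and passes them through a Transformer temporal encoder before frame-level classification. This is a minimal PyTorch sketch under assumed dimensions and hyperparameters (e.g. a 768-d MAE visual feature, a 128-d audio feature, 8 expression classes); it is not the paper's actual configuration.

```python
import torch
import torch.nn as nn

class TemporalExpressionClassifier(nn.Module):
    """Fuse per-frame visual and audio features, model temporal context
    with a Transformer encoder, and predict one expression per frame.

    All dimensions below are illustrative assumptions, not the
    configuration used in the paper."""

    def __init__(self, vis_dim=768, aud_dim=128, d_model=256,
                 n_heads=4, n_layers=2, n_classes=8):
        super().__init__()
        # Project the concatenated modal features into a shared space.
        self.proj = nn.Linear(vis_dim + aud_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, vis_feats, aud_feats):
        # vis_feats: (B, T, vis_dim); aud_feats: (B, T, aud_dim)
        x = torch.cat([vis_feats, aud_feats], dim=-1)
        x = self.temporal(self.proj(x))
        return self.head(x)  # (B, T, n_classes) frame-level logits

model = TemporalExpressionClassifier()
logits = model(torch.randn(2, 16, 768), torch.randn(2, 16, 128))
print(tuple(logits.shape))  # (batch, frames, classes)
```

An ensemble in the spirit of the abstract could then average the per-frame logits of several such models built from different feature combinations.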