EmotiEffNets for Facial Processing in Video-Based Valence-Arousal Prediction, Expression Classification and Action Unit Detection

Andrey V. Savchenko; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2023, pp. 5716-5724

Abstract


In this article, pre-trained convolutional neural networks from the EmotiEffNet family are used for frame-level feature extraction in the downstream emotion analysis tasks of the fifth Affective Behavior Analysis in-the-wild (ABAW) competition. In particular, we propose an ensemble of a multi-layer perceptron and a LightAutoML-based classifier. Post-processing that smooths the predictions over sequential frames is also implemented. Experimental results on the large-scale Aff-Wild2 database demonstrate that our model significantly outperforms the baseline facial processing based on VGGFace and ResNet. For example, our macro-averaged F1-scores for facial expression recognition and action unit detection on the test set are 11-13% greater. Moreover, the concordance correlation coefficients for valence/arousal estimation are up to 30% higher than those of the baseline.
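The pipeline described in the abstract (per-frame scores from a classifier ensemble, temporal smoothing over sequential frames, and CCC-based evaluation of valence/arousal) can be illustrated with a minimal sketch. This is not the author's implementation: the box-filter window size, the equal ensemble weights, and the helper names smooth_predictions and concordance_cc are assumptions made purely for illustration; only NumPy is used.

    import numpy as np

    def smooth_predictions(frame_preds, window=5):
        # Box-filter smoothing of per-frame predictions.
        # frame_preds: (num_frames, num_outputs) array of scores.
        # The window length is an assumption; in practice it would be tuned on validation data.
        kernel = np.ones(window) / window
        return np.stack(
            [np.convolve(frame_preds[:, j], kernel, mode="same")
             for j in range(frame_preds.shape[1])],
            axis=1,
        )

    def concordance_cc(y_true, y_pred):
        # Concordance correlation coefficient (CCC), the metric used for valence/arousal.
        mean_t, mean_p = y_true.mean(), y_pred.mean()
        var_t, var_p = y_true.var(), y_pred.var()
        cov = np.mean((y_true - mean_t) * (y_pred - mean_p))
        return 2 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)

    # Hypothetical usage: blend per-frame valence/arousal scores from an MLP and a
    # LightAutoML model (equal weights are an assumption), then smooth them over time.
    mlp_scores = np.random.uniform(-1, 1, (100, 2))      # stand-in for MLP outputs
    automl_scores = np.random.uniform(-1, 1, (100, 2))   # stand-in for LightAutoML outputs
    ensemble = 0.5 * mlp_scores + 0.5 * automl_scores
    smoothed = smooth_predictions(ensemble, window=5)
    print(concordance_cc(np.random.uniform(-1, 1, 100), smoothed[:, 0]))

The same smoothing applies to expression and action-unit logits before taking the argmax or thresholding, which is how frame-level predictions are typically stabilized in such video-based pipelines.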

Related Material


[pdf]
[bibtex]
@InProceedings{Savchenko_2023_CVPR,
    author    = {Savchenko, Andrey V.},
    title     = {EmotiEffNets for Facial Processing in Video-Based Valence-Arousal Prediction, Expression Classification and Action Unit Detection},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2023},
    pages     = {5716-5724}
}