Leveraging Pre-trained Multi-task Deep Models for Trustworthy Facial Analysis in Affective Behaviour Analysis In-the-Wild

Andrey V. Savchenko; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 4703-4712

Abstract


This article presents our results for the sixth Affective Behavior Analysis in-the-wild (ABAW) competition. To improve the trustworthiness of facial analysis, we study the possibility of using pre-trained deep models that extract reliable emotional features without fine-tuning the neural networks for a downstream task. In particular, we introduce several lightweight models based on the MobileViT, MobileFaceNet, EfficientNet, and DDAMFN architectures, trained in multi-task scenarios to recognize facial expressions, valence, and arousal from static photos. These neural networks extract frame-level features that are fed into a simple classifier, e.g., a linear feed-forward neural network, to predict emotion intensity, compound expressions, and valence/arousal. Experimental results for three tasks from the sixth ABAW challenge demonstrate that our approach significantly improves quality metrics on the validation sets compared to existing non-ensemble techniques. As a result, our solution took second place in the compound expression recognition competition.
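The pipeline described above can be sketched as follows: frame-level embeddings from a frozen pre-trained backbone are fed into a simple linear classifier. This is an illustrative sketch, not the paper's code; the feature dimensionality (512), the number of expression classes (7), and the use of scikit-learn's logistic regression as the lightweight classifier are all assumptions, and random features stand in for real backbone embeddings.

```python
# Hedged sketch of the abstract's pipeline: a frozen pre-trained model
# yields frame-level features; only a simple linear head is trained.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stand-in for embeddings extracted by a frozen multi-task backbone
# (e.g., an EfficientNet- or DDAMFN-style model); 512-D is hypothetical.
X_train = rng.normal(size=(200, 512))
y_train = rng.integers(0, 7, size=200)  # 7 basic expression classes (assumed)

# Train only the lightweight linear classifier; the backbone stays frozen.
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# At inference, the same frozen features are scored by the linear head.
X_val = rng.normal(size=(10, 512))
pred = clf.predict(X_val)
print(pred.shape)
```

The same frozen features could feed separate linear heads for the different tasks (expression classes, compound expressions, valence/arousal regression), which is what makes the approach cheap to adapt per task.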

Related Material


[bibtex]
@InProceedings{Savchenko_2024_CVPR,
    author    = {Savchenko, Andrey V.},
    title     = {Leveraging Pre-trained Multi-task Deep Models for Trustworthy Facial Analysis in Affective Behaviour Analysis In-the-Wild},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {4703-4712}
}