Facial Action Unit Recognition in the Wild With Multi-Task CNN Self-Training for the EmotioNet Challenge

Philipp Werner, Frerk Saxen, Ayoub Al-Hamadi; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2020, pp. 410-411

Abstract


Automatic understanding of facial behavior is hampered by factors such as occlusion, illumination, non-frontal head pose, low image resolution, and limitations in labeled training data. The EmotioNet 2020 Challenge addresses these issues through a competition on recognizing facial action units on in-the-wild data. We propose combining multi-task learning and self-training to make the best use of the small manually/fully labeled and the large weakly/partially labeled training datasets provided by the challenge organizers. With our approach (and without using additional data) we achieve second place in the 2020 challenge, trailing the winner by only 0.05% and leading the third place by 5.9%. On the 2018 challenge evaluation data our method outperforms all other known results.
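The core idea, combining a shared multi-task model with self-training, can be illustrated with a minimal, framework-free sketch: train a multi-output classifier (one binary "action unit" head per task) on the small fully labeled set, pseudo-label confident samples from the large unlabeled pool, and retrain on the union. Everything below (toy data, logistic-regression heads, the 0.9 confidence threshold) is illustrative and not taken from the paper, which uses a CNN:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in data: 2 binary "action unit" tasks driven by linear signals.
n_l, n_u, d, n_aus = 200, 1000, 8, 2
W_true = rng.normal(size=(d, n_aus))
X_l = rng.normal(size=(n_l, d))
Y_l = (X_l @ W_true > 0).astype(float)   # small fully labeled set
X_u = rng.normal(size=(n_u, d))          # large pool, labels withheld

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, Y, epochs=300, lr=0.5):
    """Multi-task logistic regression: shared input, one output per AU."""
    W = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(epochs):
        P = sigmoid(X @ W)
        W -= lr * X.T @ (P - Y) / len(X)  # gradient of mean cross-entropy
    return W

# 1) Train on the small fully labeled set.
W0 = train(X_l, Y_l)

# 2) Self-training step: pseudo-label confident unlabeled samples.
P_u = sigmoid(X_u @ W0)
conf = np.maximum(P_u, 1.0 - P_u)        # per-AU prediction confidence
keep = conf.min(axis=1) > 0.9            # keep rows confident on every AU
X_pl, Y_pl = X_u[keep], (P_u[keep] > 0.5).astype(float)

# 3) Retrain on labeled + pseudo-labeled data.
W1 = train(np.vstack([X_l, X_pl]), np.vstack([Y_l, Y_pl]))

acc = ((sigmoid(X_l @ W1) > 0.5) == Y_l).mean()
print(f"kept {keep.sum()} pseudo-labeled samples, labeled-set acc {acc:.2f}")
```

In the paper's setting, the same loop would wrap a CNN with one output head per action unit, and the weakly/partially labeled data supplies some true labels alongside the pseudo-labels.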

Related Material


@InProceedings{Werner_2020_CVPR_Workshops,
author = {Werner, Philipp and Saxen, Frerk and Al-Hamadi, Ayoub},
title = {Facial Action Unit Recognition in the Wild With Multi-Task CNN Self-Training for the EmotioNet Challenge},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2020}
}