Weakly-Supervised Deep Convolutional Neural Network Learning for Facial Action Unit Intensity Estimation

Yong Zhang, Weiming Dong, Bao-Gang Hu, Qiang Ji; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 2314-2323

Abstract


Facial action unit (AU) intensity estimation plays an important role in affective computing and human-computer interaction. Recent works have introduced deep neural networks for AU intensity estimation, but they require a large number of intensity annotations. AU annotation requires strong domain expertise, so constructing a database large enough to learn deep models is expensive. We propose a novel knowledge-based semi-supervised deep convolutional neural network for AU intensity estimation with extremely limited AU annotations. Only the intensity annotations of peak and valley frames in training sequences are needed. To provide additional supervision for model learning, we exploit naturally existing constraints on AUs, including relative appearance similarity, temporal intensity ordering, facial symmetry, and contrastive appearance difference. Experimental evaluations are performed on two public benchmark databases. With around 2% of intensity annotations in FERA 2015 and around 1% in DISFA for training, our method achieves performance comparable to or even better than state-of-the-art methods that use 100% of the intensity annotations in the training set.
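
As a rough illustration of how a constraint such as temporal intensity ordering could supervise unlabeled frames, the sketch below penalizes predictions that break a monotonic trend between a valley frame and a peak frame. This is not the paper's loss: the hinge form, the function name `ordering_penalty`, and the assumption that intensity changes monotonically between an annotated valley and the next peak are all illustrative choices made here.

```python
def ordering_penalty(pred, increasing=True):
    """Sum of hinge violations of a monotonic trend over consecutive frames.

    pred: predicted AU intensities for the frames of one segment
          (assumed to run from a valley frame to a peak frame, or the reverse).
    increasing: True if the segment is expected to rise, False if it should fall.
    """
    total = 0.0
    for prev, nxt in zip(pred, pred[1:]):
        # A violation occurs when the later frame moves against the expected trend.
        diff = prev - nxt if increasing else nxt - prev
        total += max(0.0, diff)
    return total


# Hypothetical predictions for a valley-to-peak segment.
rising = [0.0, 1.0, 2.5, 4.0]   # respects the ordering
noisy = [0.0, 2.0, 1.5, 4.0]    # one frame dips against the trend

print(ordering_penalty(rising))
print(ordering_penalty(noisy))
```

A penalty of this kind needs no intensity labels for the intermediate frames, only the knowledge of where the segment's peak and valley lie, which matches the annotation budget described above.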

Related Material


@InProceedings{Zhang_2018_CVPR,
author = {Zhang, Yong and Dong, Weiming and Hu, Bao-Gang and Ji, Qiang},
title = {Weakly-Supervised Deep Convolutional Neural Network Learning for Facial Action Unit Intensity Estimation},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2018}
}