FAN-Trans: Online Knowledge Distillation for Facial Action Unit Detection

Jing Yang, Jie Shen, Yiming Lin, Yordan Hristov, Maja Pantic; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023, pp. 6019-6027

Abstract


Due to its importance in facial behaviour analysis, facial action unit (AU) detection has attracted increasing attention from the research community. Leveraging the online knowledge distillation framework, we propose the "FAN-Trans" method for AU detection. Our model consists of a hybrid network of convolution layers and transformer blocks designed to learn per-AU features and to model AU co-occurrences. The model uses a pre-trained face alignment network as the feature extractor. After further transformation by a small learnable add-on convolutional subnet, the per-AU features are fed into transformer blocks to enhance their representation. As multiple AUs often appear together, we propose a learnable attention-drop mechanism in the transformer blocks to learn the correlations between the features of different AUs. We also design a classifier that predicts AU presence by considering all AUs' features, explicitly capturing label dependencies. Finally, we make the first attempt at adapting online knowledge distillation to the training stage of this task, further improving the model's performance. Experiments on the BP4D and DISFA datasets show that our method achieves new state-of-the-art performance on both, demonstrating its effectiveness.
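The abstract does not spell out the distillation loss, but the general idea of online knowledge distillation with multi-label (per-AU, sigmoid) outputs can be illustrated with a minimal sketch. The symmetric formulation, temperature value, and function names below are illustrative assumptions, not the paper's actual implementation: two peer branches co-train, and each is regularised toward the other's softened AU probabilities via a per-AU Bernoulli KL divergence.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def binary_kl(p, q, eps=1e-7):
    """KL divergence between per-AU Bernoulli distributions p (teacher side)
    and q (student side), computed element-wise over the AU dimension."""
    p = np.clip(p, eps, 1.0 - eps)
    q = np.clip(q, eps, 1.0 - eps)
    return p * np.log(p / q) + (1.0 - p) * np.log((1.0 - p) / (1.0 - q))

def online_kd_loss(logits_a, logits_b, temperature=2.0):
    """Symmetric online distillation between two peer branches (hypothetical
    formulation): each branch learns from the other's temperature-softened
    AU probabilities. In a real training loop, gradients would be stopped on
    the 'teacher' side of each term; here we only compute the scalar loss."""
    pa = sigmoid(np.asarray(logits_a) / temperature)
    pb = sigmoid(np.asarray(logits_b) / temperature)
    loss_a = binary_kl(pb, pa).mean()  # branch A distilled from branch B
    loss_b = binary_kl(pa, pb).mean()  # branch B distilled from branch A
    return (loss_a + loss_b) / 2.0

# Example: logits for 12 AUs from two peer branches
rng = np.random.default_rng(0)
la, lb = rng.normal(size=12), rng.normal(size=12)
print(online_kd_loss(la, lb))
```

The loss is zero when both branches agree exactly and grows as their per-AU probabilities diverge, so minimising it alongside the supervised AU losses pulls the peers toward a consensus prediction.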

Related Material


[bibtex]
@InProceedings{Yang_2023_WACV,
    author    = {Yang, Jing and Shen, Jie and Lin, Yiming and Hristov, Yordan and Pantic, Maja},
    title     = {FAN-Trans: Online Knowledge Distillation for Facial Action Unit Detection},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2023},
    pages     = {6019-6027}
}