Dynamic Facial Models for Video-Based Dimensional Affect Estimation

Siyang Song, Enrique Sanchez-Lozano, Mani Kumar Tellamekala, Linlin Shen, Alan Johnston, Michel Valstar; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2019


Dimensional affect estimation from a face video is a challenging task, mainly due to the large number of possible facial displays made up of a set of behaviour primitives, including facial muscle actions. The displays vary not only in composition but also in temporal evolution, with each display composed of behaviour primitives that vary in their short- and long-term characteristics. Most existing work on affect modelling relies on complex hierarchical recurrent models that are unable to capture short-term dynamics well. In this paper, we propose to encode these short-term facial shape and appearance dynamics in an image, where only the semantically meaningful information is encoded into the dynamic face images. We also propose binary dynamic facial masks to remove 'stable pixels' from the dynamic images. This process filters out non-dynamic information, i.e. only pixels that have changed in the sequence are retained. The final proposed Dynamic Facial Model (DFM) then encodes both the filtered facial appearance and the shape dynamics of the image sequence preceding a given frame into a three-channel raster image. A CNN-RNN architecture is tasked with modelling primarily the long-term changes. Experiments show that our dynamic face images achieve superior performance over standard RGB face images on the dimensional affect prediction task.
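
The stable-pixel filtering described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how change is measured, so the sketch assumes a pixel counts as "changed" when its intensity range over the window exceeds a threshold. The function names `binary_dynamic_mask` and `apply_mask` and the default `threshold` value are hypothetical, chosen only for illustration.

```python
import numpy as np


def binary_dynamic_mask(frames, threshold=10.0):
    """Return an (H, W) boolean mask that is True where a pixel changed.

    frames: array-like of shape (T, H, W), grayscale frames of one window.
    Assumption: "changed" means the per-pixel intensity range (max - min)
    over the window exceeds `threshold`; the paper may define it differently.
    """
    frames = np.asarray(frames, dtype=np.float64)
    change = frames.max(axis=0) - frames.min(axis=0)
    return change > threshold


def apply_mask(image, mask):
    """Zero out stable pixels of an (H, W) or (H, W, C) image."""
    image = np.asarray(image, dtype=np.float64)
    if image.ndim == 3:
        # Add a trailing axis so the mask broadcasts over the channels.
        mask = mask[..., None]
    return np.where(mask, image, 0.0)


if __name__ == "__main__":
    # Synthetic window: three 2x2 frames in which only one pixel changes.
    window = np.zeros((3, 2, 2))
    window[1, 0, 0] = 50.0
    mask = binary_dynamic_mask(window)
    dynamic_image = np.full((2, 2, 3), 7.0)  # stand-in for a dynamic image
    print(apply_mask(dynamic_image, mask)[..., 0])
```

In this toy window only the top-left pixel varies, so it is the only one retained after masking; the other pixels are set to zero in every channel.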

Related Material

author = {Song, Siyang and Sanchez-Lozano, Enrique and Kumar Tellamekala, Mani and Shen, Linlin and Johnston, Alan and Valstar, Michel},
title = {Dynamic Facial Models for Video-Based Dimensional Affect Estimation},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
month = {Oct},
year = {2019}