Unsupervised Deep Representations for Learning Audience Facial Behaviors

Suman Saha, Rajitha Navarathna, Leonhard Helminger, Romann M. Weber; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2018, pp. 1132-1137

Abstract

In this paper, we present an unsupervised learning approach for analyzing facial behavior based on a deep generative model combined with a convolutional neural network (CNN). We jointly train a variational auto-encoder (VAE) and a generative adversarial network (GAN) to learn a powerful latent representation from footage of audiences viewing feature-length movies. We show that the learned latent representation successfully encodes meaningful signatures of behaviors related to audience engagement (smiling & laughing) and disengagement (yawning). Our results provide a proof of concept for a more general methodology for annotating hard-to-label multimedia data featuring sparse examples of signals of interest.
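The joint VAE/GAN objective mentioned in the abstract can be illustrated with a minimal numpy sketch of its loss terms (a diagonal-Gaussian KL penalty, a reconstruction term, and an adversarial term). This is an assumption-laden illustration in the spirit of standard VAE-GAN formulations, not the authors' implementation; all variable names, shapes, and the unweighted loss combination are hypothetical.

```python
import numpy as np

# Illustrative sketch of the loss terms in a joint VAE-GAN objective.
# The weighting, shapes, and placeholder values are assumptions,
# not the authors' actual training setup.

rng = np.random.default_rng(0)

def kl_divergence(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dims."""
    return 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar)

def bce(pred, target, eps=1e-8):
    """Binary cross-entropy used for the adversarial (GAN) terms."""
    return -(target * np.log(pred + eps) + (1 - target) * np.log(1 - pred + eps))

def reparameterize(mu, logvar):
    """Sample z = mu + sigma * eps so gradients can flow through the encoder."""
    return mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)

# Toy encoder outputs for one face crop mapped to a 4-D latent space.
mu = np.array([0.1, -0.2, 0.0, 0.3])
logvar = np.array([-0.1, 0.2, 0.0, -0.3])
z = reparameterize(mu, logvar)

# Stand-in discriminator scores on a real frame and on a decoded frame.
d_real, d_fake = 0.9, 0.2

recon_loss = 0.5                                  # placeholder reconstruction error
kl_loss = kl_divergence(mu, logvar)               # regularizes the latent space
gan_loss = bce(d_real, 1.0) + bce(d_fake, 0.0)    # discriminator objective

total = recon_loss + kl_loss + gan_loss           # unweighted combination
print(round(float(total), 4))
```

In practice the three terms are weighted and optimized in alternating generator/discriminator steps; the sketch only shows how the latent representation is shaped by the combined objective.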

Related Material

[pdf] [arXiv]
[bibtex]
@InProceedings{Saha_2018_CVPR_Workshops,
author = {Saha, Suman and Navarathna, Rajitha and Helminger, Leonhard and Weber, Romann M.},
title = {Unsupervised Deep Representations for Learning Audience Facial Behaviors},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2018}
}