Multi-Task Learning for Simultaneous Video Generation and Remote Photoplethysmography Estimation

Yun-Yun Tsou, Yi-An Lee, Chiou-Ting Hsu; Proceedings of the Asian Conference on Computer Vision (ACCV), 2020

Abstract


Remote photoplethysmography (rPPG) is a contactless method for estimating physiological signals from facial videos. Without large supervised datasets, learning a robust rPPG estimation model is extremely challenging. Instead of focusing solely on model learning, we believe data augmentation may be of even greater importance for this task. In this paper, we propose a novel multi-task learning framework that augments the training data while simultaneously learning the rPPG estimation model. We design three jointly learned networks: an rPPG estimation network, an Image-to-Video network, and a Video-to-Video network, which respectively estimate rPPG signals from face videos, generate synthetic videos from a source image and a specified rPPG signal, and generate synthetic videos from a source video and a specified rPPG signal. Experimental results on three benchmark datasets, COHFACE, UBFC, and PURE, show that our method generates photo-realistic videos and outperforms existing methods by a large margin.
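The abstract's three-network data flow can be sketched in miniature. The functions below are hypothetical stand-ins (simple brightness/green-channel heuristics, not the paper's learned networks) chosen only to make the input and output shapes and the augmentation loop concrete: synthesize a video that carries a specified rPPG signal, then check that the estimator recovers a signal consistent with it.

```python
import numpy as np

# Illustrative tensor sizes: frames, height, width, channels (not from the paper).
T, H, W, C = 8, 16, 16, 3
rng = np.random.default_rng(0)

def rppg_estimator(video):
    """Estimate a per-frame rPPG signal from a face video.
    Placeholder: spatial mean of the green channel per frame, a classic
    hand-crafted baseline standing in for the learned estimation network."""
    return video[..., 1].mean(axis=(1, 2))  # shape (T,)

def image_to_video(image, signal):
    """Generate a synthetic video from one source image and a target rPPG
    signal by modulating brightness frame-by-frame.
    Placeholder for the learned Image-to-Video generator."""
    return image[None] * (1.0 + 0.05 * signal)[:, None, None, None]  # (T, H, W, C)

def video_to_video(video, signal):
    """Re-render a source video so it carries a new target rPPG signal.
    Placeholder for the learned Video-to-Video generator."""
    current = rppg_estimator(video)
    scale = (1.0 + 0.05 * signal) / (1.0 + 0.05 * current)
    return video * scale[:, None, None, None]  # (T, H, W, C)

# Augmentation loop: specify a signal, synthesize a video, and verify the
# estimator recovers a signal correlated with the one we specified.
image = rng.random((H, W, C))
target = np.sin(np.linspace(0.0, 2.0 * np.pi, T))
fake = image_to_video(image, target)
recovered = rppg_estimator(fake)
corr = np.corrcoef(target, recovered)[0, 1]
```

In this toy setting the recovered signal is an affine function of the target, so `corr` is close to 1; in the paper, the same consistency between the specified signal and the estimate from the generated video is what couples the generators to the estimator during joint training.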

Related Material


[pdf] [supp] [code]
[bibtex]
@InProceedings{Tsou_2020_ACCV,
  author    = {Tsou, Yun-Yun and Lee, Yi-An and Hsu, Chiou-Ting},
  title     = {Multi-Task Learning for Simultaneous Video Generation and Remote Photoplethysmography Estimation},
  booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
  month     = {November},
  year      = {2020}
}