A Temporal Encoder-Decoder Approach to Extracting Blood Volume Pulse Signal Morphology From Face Videos
This paper considers methods for extracting blood volume pulse (BVP) representations from video of the human face. Whereas most previous systems have been concerned with estimating vital signs such as average heart rate, this paper addresses the more difficult problem of recovering BVP signal morphology. We present a new approach that is inspired by temporal encoder-decoder architectures that have been used for audio signal separation. As input, this system accepts a temporal sequence of RGB (red, green, blue) values that have been spatially averaged over a small portion of the face. The output of the system is a temporal sequence that approximates a BVP signal. In order to reduce noise in the recovered signal, a separate processing step extracts individual pulses and performs normalization and outlier removal. After these steps, individual pulse shapes have been extracted that are sufficiently distinct to support biometric authentication. Our findings demonstrate the effectiveness of our approach in extracting BVP signal morphology from facial videos, which presents exciting opportunities for further research in this area. The source code is available at https://github.com/Adleof/CVPM-2023-Temporal-Encoder-Decoder-iPPG.