MANet: A Motion-Driven Attention Network for Detecting the Pulse From a Facial Video With Drastic Motions
Video photoplethysmography (VPPG) detects pulse signals from facial videos and is becoming increasingly popular because of its convenience and low cost. However, it is not sufficiently robust to drastic motion disturbances, such as the continuous head movements that occur in real life. This paper proposes a motion-driven attention network (MANet) to improve the motion robustness of VPPG. MANet takes as inputs the frequency spectra of a skin color signal and of a synchronous nose motion signal, and then removes the motion features from the skin color signal using an attention mechanism driven by the nose motion signal. It thus predicts a frequency spectrum free of components caused by motion disturbances, which is finally transformed back into a pulse signal. MANet is tested on 1000 samples from 200 subjects provided by the 2nd Remote Physiological Signal Sensing (RePSS) Challenge. It achieves a mean inter-beat-interval (IBI) error of 122.80 milliseconds and a mean heart rate error of 7.29 beats per minute.
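
The pipeline described above (spectrum in, motion components suppressed, pulse signal out) can be illustrated with a minimal sketch. This is not the MANet architecture: the learned attention is replaced here by a hand-crafted spectral mask, and the function name `suppress_motion`, the 0.7 to 4.0 Hz pulse band, and the `sharpness` parameter are all assumptions made for illustration only.

```python
import numpy as np

def suppress_motion(color, motion, fs, band=(0.7, 4.0), sharpness=5.0):
    """Hand-crafted stand-in for a learned motion-driven attention (assumption).

    color  : 1-D skin color signal
    motion : 1-D synchronous motion signal (e.g. nose position), same length
    fs     : sampling rate in Hz
    Returns the reconstructed pulse signal and a heart rate estimate in bpm.
    """
    n = len(color)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)

    # Frequency spectra of both inputs, with the mean (DC offset) removed.
    color_spec = np.fft.rfft(color - np.mean(color))
    motion_spec = np.fft.rfft(motion - np.mean(motion))

    # Restrict attention to a plausible pulse band (assumed 42 to 240 bpm).
    in_band = (freqs >= band[0]) & (freqs <= band[1])

    # Normalise the in-band motion magnitude to the range [0, 1].
    motion_mag = np.abs(motion_spec) * in_band
    peak = motion_mag.max()
    motion_norm = motion_mag / peak if peak > 0 else motion_mag

    # Attention weights: near 0 where motion is strong, near 1 elsewhere.
    attention = in_band * (1.0 - motion_norm) ** sharpness

    # Apply the mask and transform the cleaned spectrum back to a signal.
    clean_spec = color_spec * attention
    pulse = np.fft.irfft(clean_spec, n=n)
    hr_bpm = 60.0 * freqs[np.argmax(np.abs(clean_spec))]
    return pulse, hr_bpm

# Synthetic example: a 1.2 Hz pulse buried under a stronger 2.0 Hz head motion.
fs = 30.0
t = np.arange(600) / fs
motion = np.sin(2 * np.pi * 2.0 * t)
color = np.sin(2 * np.pi * 1.2 * t) + 3.0 * motion
pulse, hr_bpm = suppress_motion(color, motion, fs)
print(f"estimated heart rate: {hr_bpm:.1f} bpm")
```

In this synthetic case the strongest peak of the raw color spectrum sits at the motion frequency, so a naive peak pick would report the motion rate rather than the pulse rate. A learned attention, as proposed in the paper, would be expected to handle cases this fixed mask cannot, such as motion and pulse frequencies that overlap.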