Robust Eye Blink Detection Using Dual Embedding Video Vision Transformer
Eye blink detection serves as a crucial biomarker for evaluating both physical and mental states, garnering considerable attention in biometric and video-based studies. Among various methods, video-based eye blink detection has been particularly favored due to its non-invasive nature, enabling broader applications. However, capturing eye blinks from different camera angles poses significant challenges, primarily because the eye region is relatively small and eye blinks occur rapidly, necessitating a robust detection algorithm. To address these challenges, we introduce Dual Embedding Video Vision Transformer (DE-ViViT), a novel approach for eye blink detection that employs two different embedding strategies: (i) tubelet embedding and (ii) residual embedding. Each embedding can capture large and subtle changes within the eye movement sequence respectively. We rigorously evaluate our proposed method using HUST-LEBW, a publicly available dataset, as well as our newly collected multi-angle eye blink dataset (MAEB). The results indicate that the proposed model consistently outperforms existing methods across both datasets, with notably minor performance variations depending on the camera angles.