Learning Interval-Aware Embedding for Macro- and Micro-expression Spotting

Xiaodong Li, Jiajun Li, Wenchao Du, Hu Chen, Hongyu Yang; Proceedings of the Asian Conference on Computer Vision (ACCV), 2024, pp. 337-353

Abstract


Spotting the start and end frames of macro- and micro-expressions in untrimmed long videos (i.e., Macro- and Micro-Expression Spotting, abbreviated as M^2ES) is extremely challenging because interval scales vary significantly. Leading works borrowed the idea of the "anchor" from temporal action localization for M^2ES and achieved large improvements thanks to finer proposal generation. However, covering diverse intervals is difficult for anchor-based methods because of latent domain shifts between macro- and micro-expression instances. Instead, we propose a purely anchor-free method for M^2ES that eliminates redundant hyperparameter settings and is both efficient and effective. Specifically, we explore an Interval-aware Embedding Network (IAENet). It first uses a basic two-stream network as the backbone to extract spatial and temporal feature embeddings from videos and optical flows. A carefully designed temporal pyramid module then processes interval-specific macro- and micro-expression instances in parallel through a novel temporal attention mechanism and cross-scale feature fusion modules. We further design an interval-aware proposal generation scheme that specializes each spotting branch by sampling instances of suitable intervals during training and inference. Extensive experiments demonstrate that our method outperforms existing interval-based and frame-based methods, achieving state-of-the-art results on the CAS(ME)^2 dataset and competitive results on the SAMM-LV dataset.
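
As a rough illustration of the interval-aware idea described in the abstract, the sketch below routes ground-truth expression intervals to a "micro" or "macro" spotting branch according to their duration. It is not the authors' implementation: the function name `assign_to_branches`, the two-branch split, and the 0.5-second cutoff (a commonly used duration bound for micro-expressions) are assumptions made for this example only, and the abstract does not state how IAENet actually samples instances.

```python
from typing import Dict, List, Tuple

# (onset_frame, offset_frame), both inclusive
Interval = Tuple[int, int]


def assign_to_branches(
    intervals: List[Interval],
    fps: float,
    micro_max_seconds: float = 0.5,  # assumed cutoff, not taken from the paper
) -> Dict[str, List[Interval]]:
    """Hypothetical helper: split intervals between two spotting branches by length."""
    micro_max_frames = micro_max_seconds * fps
    branches: Dict[str, List[Interval]] = {"micro": [], "macro": []}
    for onset, offset in intervals:
        length = offset - onset + 1  # interval length in frames
        key = "micro" if length <= micro_max_frames else "macro"
        branches[key].append((onset, offset))
    return branches


if __name__ == "__main__":
    # Two made-up intervals in a 30 fps video: 11 frames and 61 frames long.
    print(assign_to_branches([(10, 20), (100, 160)], fps=30))
```

With a split of this kind, each branch would only see instances whose lengths fall in its own range, which is one plausible reading of "sampling instances of proper intervals" for each spotting branch.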

Related Material


[pdf]
[bibtex]
@InProceedings{Li_2024_ACCV,
    author    = {Li, Xiaodong and Li, Jiajun and Du, Wenchao and Chen, Hu and Yang, Hongyu},
    title     = {Learning Interval-Aware Embedding for Macro- and Micro-expression Spotting},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {December},
    year      = {2024},
    pages     = {337-353}
}