Relational Edge-Node Graph Attention Network for Classification of Micro-Expressions
Facial micro-expressions (MEs) are subtle, transient, and involuntary facial muscle movements that reveal a person's true feelings. This paper presents a novel two-stream relational edge-node graph attention network for classifying MEs in a video by selecting high-intensity frames and edge-node features that capture both the relationships between nodes and the structural information of the graph. The paper examines the impact of different edge and node features, and the relationships among them, on the resulting graphs. First, high-intensity emotion frames are extracted from the video using optical flow. Second, node feature embeddings are computed from the landmark coordinate features and the optical-flow patch around each node location. Additionally, global and local structural similarity scores are obtained as edge features using the Jaccard similarity score and a radial basis function, respectively. Third, a self-attention graph pooling layer removes nodes with lower attention scores via top-k selection. Finally, a two-stream edge-node graph attention network learns correlations among the edge and node features, such as landmark coordinates, optical flow, and global and local edge features. A three-frame graph structure captures spatio-temporal information. Results on the SMIC and CASME II databases are compared for 3- and 5-class expression recognition.
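The two edge features described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the neighbour sets, coordinates, and the `gamma` bandwidth are hypothetical placeholders chosen for the example.

```python
import numpy as np

def jaccard_similarity(neigh_i, neigh_j):
    # Global structural similarity: Jaccard score between the
    # neighbour sets of two landmark nodes in the graph.
    union = len(neigh_i | neigh_j)
    return len(neigh_i & neigh_j) / union if union else 0.0

def rbf_similarity(coord_i, coord_j, gamma=0.5):
    # Local similarity: radial basis function on the Euclidean
    # distance between two landmark coordinates.
    d2 = np.sum((np.asarray(coord_i) - np.asarray(coord_j)) ** 2)
    return float(np.exp(-gamma * d2))

# Toy example: neighbour sets and normalized landmark coordinates
neighbours = {0: {1, 2}, 1: {0, 2, 3}}
coords = {0: (0.10, 0.20), 1: (0.15, 0.25)}

global_feat = jaccard_similarity(neighbours[0], neighbours[1])  # 0.25
local_feat = rbf_similarity(coords[0], coords[1])
edge_feature = (global_feat, local_feat)
```

In this sketch each graph edge carries a two-dimensional feature: a graph-topology score (Jaccard) and a geometric proximity score (RBF), matching the abstract's global/local distinction.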