RMFER: Semi-Supervised Contrastive Learning for Facial Expression Recognition With Reaction Mashup Video

Yunseong Cho, Chanwoo Kim, Hoseong Cho, Yunhoe Ku, Eunseo Kim, Muhammadjon Boboev, Joonseok Lee, Seungryul Baek; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, pp. 5913-5922

Abstract


Facial expression recognition (FER) has greatly benefited from deep learning but still faces challenges in dataset collection due to the nuanced nature of facial expressions. In this study, we present a novel unlabeled dataset and a semi-supervised contrastive learning framework built on Reaction Mashup (RM) videos, in which multiple individuals react to the same source film. From these videos we construct the Reaction Mashup dataset (RMset). Our framework integrates three distinct modules: a classification module for supervised facial expression categorization, an attention module for inter-sample attention learning, and a contrastive module for attention-based contrastive learning on RMset. The classification and attention modules are used for the initial training, after which the contrastive module is incorporated to enhance the learning process. Our experiments demonstrate that our method improves feature learning and outperforms state-of-the-art models on three benchmark FER datasets. Code is available at https://github.com/yunseongcho/RMFER.
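
The abstract outlines a two-stage recipe: a classification module and an inter-sample attention module are first trained on labeled FER data, and an attention-based contrastive module is then added on the unlabeled RMset. The sketch below illustrates one way such a pipeline could be wired up in PyTorch; the names (RMFERSketch, stage1_loss, stage2_contrastive_loss), the positive-pair construction, the loss weighting, and the use of nn.MultiheadAttention are illustrative assumptions rather than the authors' implementation, which is available at the linked repository.

# Minimal PyTorch-style sketch of the two-stage training described in the abstract.
# All module names, hyperparameters, and the positive-pair definition are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class RMFERSketch(nn.Module):
    """Backbone shared by a classification head and an inter-sample attention module."""

    def __init__(self, backbone, feat_dim=512, num_classes=7):
        super().__init__()
        self.backbone = backbone                            # e.g. a CNN trunk returning (B, feat_dim) features
        self.classifier = nn.Linear(feat_dim, num_classes)  # classification module
        self.attention = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)  # inter-sample attention

    def forward(self, images):
        feats = self.backbone(images)                       # (B, feat_dim)
        logits = self.classifier(feats)
        # Treat the batch as one sequence so every sample attends to the others.
        attn_out, attn_w = self.attention(feats[None], feats[None], feats[None])
        return feats, logits, attn_out.squeeze(0), attn_w.squeeze(0)


def stage1_loss(model, images, labels):
    # Stage 1: supervised training of the classification and attention modules on labeled FER data.
    _, logits, attn_feats, _ = model(images)
    return F.cross_entropy(logits, labels) + F.cross_entropy(model.classifier(attn_feats), labels)


def stage2_contrastive_loss(model, rm_frames, pos_mask, temperature=0.1):
    # Stage 2: attention-based contrastive loss on unlabeled RMset frames.
    # pos_mask[i, j] = 1 marks pairs treated as positives (assumed here: different
    # reactors captured at the same moment of the same mashup video).
    feats, _, _, attn_w = model(rm_frames)
    z = F.normalize(feats, dim=1)
    sim = z @ z.t() / temperature                                    # pairwise similarities
    sim = sim + torch.log(attn_w + 1e-8)                             # weight pairs by learned attention
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, -1e9)                           # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)       # InfoNCE-style normalization
    return -(log_prob * pos_mask).sum(1).div(pos_mask.sum(1).clamp(min=1)).mean()

In a training loop one would typically minimize stage1_loss on labeled batches first and then optimize a weighted sum of both losses once the contrastive module is brought in, mirroring the two-phase schedule described in the abstract.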

Related Material


BibTeX
@InProceedings{Cho_2024_WACV,
  author    = {Cho, Yunseong and Kim, Chanwoo and Cho, Hoseong and Ku, Yunhoe and Kim, Eunseo and Boboev, Muhammadjon and Lee, Joonseok and Baek, Seungryul},
  title     = {RMFER: Semi-Supervised Contrastive Learning for Facial Expression Recognition With Reaction Mashup Video},
  booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  month     = {January},
  year      = {2024},
  pages     = {5913-5922}
}