ContextFace: Generating Facial Expressions from Emotional Contexts
Abstract
The task of generating 3D facial expressions for various situational contexts is important for applications such as virtual avatars and human-robot interaction. The task is challenging not only because it requires a comprehensive understanding of emotions, expressions, and contexts, but also because datasets that support the task are scarce. We propose ContextFace, a Multi-modal Large Language Model (MLLM) fine-tuned to generate 3D facial expressions conditioned on complex situational contexts. To overcome the lack of datasets, we perform context augmentation on existing emotion recognition datasets: we generate plausible situations and quotes from images and emotion labels, and use them to annotate the datasets. Next, we perform visual instruction tuning of MLLMs on the context-augmented datasets to boost their capability of visual synthesis from emotions. Experiments show the superior performance of ContextFace in zero-shot evaluation of contextual emotion recognition. A qualitative evaluation shows that our method generates expressions consistent with diverse contexts and performs complex emotion reasoning, e.g., speculative generation of expressions for occluded faces through interactive prompting.
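The context-augmentation step described above can be pictured as a small annotation loop over an existing emotion recognition dataset. The following is a minimal Python sketch of that loop; the query_mllm function, the prompt wording, and the JSON schema are illustrative assumptions, not the paper's released code or exact prompts.

    # Hypothetical sketch of the context-augmentation step: for each
    # (image, emotion) record, prompt an MLLM to produce a plausible
    # situation and a quote, then attach both as new annotations.
    import json
    from pathlib import Path

    def query_mllm(image_path: str, prompt: str) -> str:
        """Placeholder for an MLLM call; plug in any vision-language backend."""
        raise NotImplementedError("connect an MLLM backend here")

    PROMPT_TEMPLATE = (
        "The person in this image feels '{emotion}'. "
        "Describe a plausible situation that could cause this emotion, "
        "and write a short quote the person might say. "
        'Answer as JSON: {{"situation": "...", "quote": "..."}}'
    )

    def augment(dataset: list[dict], out_path: str) -> None:
        """Annotate each (image, emotion) record with a situation and quote."""
        augmented = []
        for record in dataset:
            prompt = PROMPT_TEMPLATE.format(emotion=record["emotion"])
            response = query_mllm(record["image_path"], prompt)
            context = json.loads(response)  # {"situation": ..., "quote": ...}
            augmented.append({**record, **context})
        Path(out_path).write_text(json.dumps(augmented, indent=2))

The augmented records then serve as targets for the visual instruction tuning stage, pairing each image with context-conditioned supervision rather than a bare emotion label.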
Related Material

@InProceedings{Kim_2025_ICCV,
  author    = {Kim, Min-jung and Kim, Minsang and Baek, Seung Jun},
  title     = {ContextFace: Generating Facial Expressions from Emotional Contexts},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  month     = {October},
  year      = {2025},
  pages     = {11383-11392}
}