A StyleCLIP-based Facial Emotion Manipulation Method for Discrepant Emotion Transitions

Qi Guo, Xiaodong Gu; Proceedings of the Asian Conference on Computer Vision (ACCV), 2024, pp. 3414-3431

Abstract

Leveraging StyleCLIP's expressivity and its disentangled latent codes, current methods enable facial emotion manipulation through textual inputs. Despite these advances, significant challenges remain in manipulating target emotions that deviate markedly from the originals without introducing artifacts or errors. This paper introduces a novel approach for discrepant emotion transitions. Our network architecture integrates a StyleGAN2 generator with an Emotion Manipulation Mapper, a Dual Auxiliary Classifier, and a CLIP Text Encoder. By utilizing the inverse cumulative distribution function, we convert source emotion labels into conditional data, enhancing the model's ability to accurately map and modify the emotional distribution across faces. We evaluate our method against established techniques on the Radboud Faces Database and the CelebA-HQ dataset, and introduce a new quantitative evaluation comprising seven metrics for assessing manipulation efficacy.
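
To make the label-conditioning step concrete, the following is a minimal sketch, not the authors' released code, of converting a discrete source-emotion label into continuous conditional data via the inverse cumulative distribution function, here the Gaussian quantile function from SciPy. The emotion label set (Radboud-style), the smoothing constant eps, and the vector layout are illustrative assumptions.

# Minimal sketch: discrete emotion label -> continuous conditional vector
# via the inverse CDF (Gaussian quantile function norm.ppf). The label set
# and the smoothing constant `eps` are assumptions for illustration only.
import numpy as np
from scipy.stats import norm

EMOTIONS = ["angry", "contemptuous", "disgusted", "fearful",
            "happy", "neutral", "sad", "surprised"]  # Radboud-style labels (assumed)

def label_to_condition(label: str, eps: float = 1e-3) -> np.ndarray:
    """Smooth a one-hot label into probabilities, then apply the inverse CDF.

    Smoothing keeps every probability strictly inside (0, 1) so norm.ppf
    stays finite; the quantile transform spreads the hard labels over a
    continuous real-valued conditioning space.
    """
    k = len(EMOTIONS)
    probs = np.full(k, eps / (k - 1))          # residual mass shared by non-target labels
    probs[EMOTIONS.index(label)] = 1.0 - eps   # bulk of the mass on the source label
    return norm.ppf(probs)                     # inverse CDF: probabilities -> reals

cond = label_to_condition("happy")
print(cond.shape)  # (8,)

One plausible motivation for such a transform is that nearby probabilities map to nearby real values, giving the mapper a smooth conditioning signal rather than a hard one-hot code.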

Related Material

[pdf]
[bibtex]
@InProceedings{Guo_2024_ACCV,
    author    = {Guo, Qi and Gu, Xiaodong},
    title     = {A StyleCLIP-based Facial Emotion Manipulation Method for Discrepant Emotion Transitions},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {December},
    year      = {2024},
    pages     = {3414-3431}
}