Text-Guided 3D Face Synthesis - From Generation to Editing

Yunjie Wu, Yapeng Meng, Zhipeng Hu, Lincheng Li, Haoqian Wu, Kun Zhou, Weiwei Xu, Xin Yu; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 1260-1269

Abstract


Text-guided 3D face synthesis has achieved remarkable results by leveraging text-to-image (T2I) diffusion models. However most existing works focus solely on the direct generation ignoring the editing restricting them from synthesizing customized 3D faces through iterative adjustments. In this paper we propose a unified text-guided framework from face generation to editing. In the generation stage we propose a geometry-texture decoupled generation to mitigate the loss of geometric details caused by coupling. Besides decoupling enables us to utilize the generated geometry as a condition for texture generation yielding highly geometry-texture aligned results. We further employ a fine-tuned texture diffusion model to enhance texture quality in both RGB and YUV space. In the editing stage we first employ a pre-trained diffusion model to update facial geometry or texture based on the texts. To enable sequential editing we introduce a UV domain consistency preservation regularization preventing unintentional changes to irrelevant facial attributes. Besides we propose a self-guided consistency weight strategy to improve editing efficacy while preserving consistency. Through comprehensive experiments we showcase our method's superiority in face synthesis.

Related Material


@InProceedings{Wu_2024_CVPR,
    author    = {Wu, Yunjie and Meng, Yapeng and Hu, Zhipeng and Li, Lincheng and Wu, Haoqian and Zhou, Kun and Xu, Weiwei and Yu, Xin},
    title     = {Text-Guided 3D Face Synthesis - From Generation to Editing},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {1260-1269}
}