Conceptual and Hierarchical Latent Space Decomposition for Face Editing

Savas Ozkan, Mete Ozay, Tom Robinson; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 7248-7257

Abstract


Generative Adversarial Networks (GANs) can produce photo-realistic results using an unconditional image-generation pipeline. However, the images generated by GANs (e.g., StyleGAN) are entangled in feature spaces, which makes it difficult to interpret and control the contents of images. In this paper, we present an encoder-decoder model that decomposes the entangled GAN space into a conceptual and hierarchical latent space in a self-supervised manner. The outputs of 3D morphable face models are leveraged to independently control image synthesis parameters like pose, expression, and illumination. For this purpose, a novel latent space decomposition pipeline is introduced using transformer networks and generative models. Later, this new space is used to optimize a transformer-based GAN space controller for face editing. In this work, a StyleGAN2 model for faces is utilized. Since our method manipulates only GAN features, the photo-realism of StyleGAN2 is fully preserved. The results demonstrate that our method qualitatively and quantitatively outperforms baselines in terms of identity preservation and editing precision.

Related Material


@InProceedings{Ozkan_2023_ICCV,
    author    = {Ozkan, Savas and Ozay, Mete and Robinson, Tom},
    title     = {Conceptual and Hierarchical Latent Space Decomposition for Face Editing},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {7248-7257}
}