SemanticStyleGAN: Learning Compositional Generative Priors for Controllable Image Synthesis and Editing

Yichun Shi, Xiao Yang, Yangyue Wan, Xiaohui Shen; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 11254-11264

Abstract

Recent studies have shown that StyleGANs provide promising prior models for downstream tasks in image synthesis and editing. However, since the latent codes of StyleGANs are designed to control global styles, it is hard to achieve fine-grained control over synthesized images. We present SemanticStyleGAN, in which a generator is trained to model local semantic parts separately and to synthesize images in a compositional way. The structure and texture of each local part are controlled by corresponding latent codes. Experimental results demonstrate that our model provides strong disentanglement between different spatial areas. When combined with editing methods designed for StyleGANs, it can achieve more fine-grained control when editing synthesized or real images. The model can also be extended to other domains via transfer learning. Thus, as a generic prior model with built-in disentanglement, it could facilitate the development of GAN-based applications and enable a wider range of downstream tasks.
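
To make the compositional design concrete, the sketch below shows one way the idea could be prototyped in PyTorch. This is not the authors' implementation: the class names (LocalGenerator, CompositionalGenerator), the latent and feature dimensions, and the fusion of per-part feature maps via a softmax over pseudo-depth maps are all simplifying assumptions made here for illustration.

import torch
import torch.nn as nn

class LocalGenerator(nn.Module):
    """One semantic part: maps a structure latent and a texture latent to a
    feature map plus a pseudo-depth map used for compositing (illustrative)."""
    def __init__(self, latent_dim=64, feat_dim=32, size=16):
        super().__init__()
        self.size, self.feat_dim = size, feat_dim
        self.net = nn.Sequential(
            nn.Linear(2 * latent_dim, 256), nn.ReLU(),
            nn.Linear(256, (feat_dim + 1) * size * size),
        )
    def forward(self, w_structure, w_texture):
        w = torch.cat([w_structure, w_texture], dim=1)
        out = self.net(w).view(-1, self.feat_dim + 1, self.size, self.size)
        feat, depth = out[:, :-1], out[:, -1:]  # part features + pseudo-depth
        return feat, depth

class CompositionalGenerator(nn.Module):
    """Composites K local parts: a softmax over the pseudo-depth maps yields
    soft per-part masks; the fused feature map is decoded into an image."""
    def __init__(self, num_parts=8, latent_dim=64, feat_dim=32, size=16):
        super().__init__()
        self.locals = nn.ModuleList(
            LocalGenerator(latent_dim, feat_dim, size) for _ in range(num_parts)
        )
        # Stand-in for the render network that maps fused features to RGB.
        self.render = nn.Conv2d(feat_dim, 3, kernel_size=3, padding=1)
    def forward(self, w_structures, w_textures):
        feats, depths = zip(*(g(ws, wt) for g, ws, wt
                              in zip(self.locals, w_structures, w_textures)))
        depths = torch.stack(depths, dim=1)    # (B, K, 1, H, W)
        masks = torch.softmax(depths, dim=1)   # soft semantic masks
        fused = (torch.stack(feats, dim=1) * masks).sum(dim=1)
        return self.render(fused), masks.squeeze(2)  # image + segmentation

# Editing one part: resample only that part's latents; others stay fixed.
G = CompositionalGenerator()
ws = [torch.randn(1, 64) for _ in range(8)]
wt = [torch.randn(1, 64) for _ in range(8)]
img, seg = G(ws, wt)
wt[2] = torch.randn(1, 64)      # change the texture of part 2 only
img_edit, _ = G(ws, wt)

In this sketch, resampling one part's latents leaves the features of the other parts untouched, and the softmax fusion produces soft per-part masks as a by-product, which is one plausible reading of the spatial disentanglement the abstract describes.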

Related Material

@InProceedings{Shi_2022_CVPR,
    author    = {Shi, Yichun and Yang, Xiao and Wan, Yangyue and Shen, Xiaohui},
    title     = {SemanticStyleGAN: Learning Compositional Generative Priors for Controllable Image Synthesis and Editing},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {11254-11264}
}