LaDiffGAN: Training GANs with Diffusion Supervision in Latent Spaces

Xuhui Liu, Bohan Zeng, Sicheng Gao, Shanglin Li, Yutang Feng, Hong Li, Boyu Liu, Jianzhuang Liu, Baochang Zhang; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 1115-1125

Abstract


Diffusion models have recently become increasingly popular in a number of computer vision tasks, but they fail to achieve satisfactory results for unsupervised image-to-image translation, since they require massive training data and rely heavily on extra guidance. In this scenario, GANs can alleviate these issues of diffusion models, albeit with suboptimal quality. In this paper, we leverage the advantages of both GANs and diffusion models by training GANs with diffusion supervision in latent spaces (LaDiffGAN) to solve the unsupervised image-to-image translation task. Firstly, to promote style transfer quality, we encode the data in specific latent spaces with the styles of the target and source domains. Secondly, we introduce the diffusion process with different amounts of Gaussian noise to enhance the modeling capability of GANs on the complex data distribution. We accordingly design a latent diffusion GAN loss to align the latent features between generated and training images. Lastly, we introduce a heterogeneous conditional denoising loss that incorporates image-level supervision to further improve the quality of generated results. Our LaDiffGAN significantly alleviates the drawbacks associated with diffusion models, such as data leakage, high inference cost, and high dependence on large training datasets. Extensive experiments show that LaDiffGAN outperforms previous GAN models and delivers comparable or even better performance than diffusion models.
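
The step of applying "different amounts of Gaussian noise" to latent features can be illustrated with a minimal sketch. It assumes the standard DDPM-style forward process with a linear noise schedule, which the abstract does not specify; the function names, schedule values, and latent shapes below are all hypothetical and are not taken from the paper.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention factors for an assumed linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def diffuse_latents(z0, t, alpha_bar, rng):
    """Sample z_t = sqrt(a_t) * z0 + sqrt(1 - a_t) * eps, one timestep per row of z0."""
    eps = rng.standard_normal(z0.shape)
    # Reshape per-sample factors so they broadcast over the feature dimensions.
    a = np.asarray(alpha_bar[t]).reshape(-1, *([1] * (z0.ndim - 1)))
    zt = np.sqrt(a) * z0 + np.sqrt(1.0 - a) * eps
    return zt, eps

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()

# Stand-ins for encoder outputs of training and generated images (hypothetical shapes).
z_real = rng.standard_normal((4, 8))
z_fake = rng.standard_normal((4, 8))

# Each sample gets its own random noise level; both branches share the same timesteps.
t = rng.integers(0, len(alpha_bar), size=4)
z_real_t, _ = diffuse_latents(z_real, t, alpha_bar, rng)
z_fake_t, _ = diffuse_latents(z_fake, t, alpha_bar, rng)
```

In a setup like this, a discriminator would compare `z_real_t` and `z_fake_t` at matching noise levels rather than the clean latents; how LaDiffGAN actually builds its latent diffusion GAN loss is described in the paper, not here.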

Related Material


@InProceedings{Liu_2024_CVPR,
    author    = {Liu, Xuhui and Zeng, Bohan and Gao, Sicheng and Li, Shanglin and Feng, Yutang and Li, Hong and Liu, Boyu and Liu, Jianzhuang and Zhang, Baochang},
    title     = {LaDiffGAN: Training GANs with Diffusion Supervision in Latent Spaces},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {1115-1125}
}