Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer

Yang, Serin; Hwang, Hyunmin; Ye, Jong Chul

Serin Yang, Hyunmin Hwang, Jong Chul Ye; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 22873-22882

Abstract

Diffusion models have shown great promise in text-guided image style transfer, but their stochastic nature creates a trade-off between style transformation and content preservation. Existing methods require computationally expensive fine-tuning of diffusion models or additional neural networks. To address this, we propose a zero-shot contrastive loss for diffusion models that requires neither additional fine-tuning nor auxiliary networks. By applying a patch-wise contrastive loss between generated samples and original image embeddings in the pre-trained diffusion model, our method generates images with the same semantic content as the source image in a zero-shot manner. Our approach outperforms existing methods while preserving content and requiring no additional training, not only for image style transfer but also for image-to-image translation and manipulation. Our experimental results validate the effectiveness of the proposed method.
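
To make the patch-wise contrastive idea concrete, the following is a minimal sketch of a generic InfoNCE-style patch loss. It is not the authors' implementation. It assumes that each patch is already represented by a feature vector (in the paper these would come from the pre-trained diffusion model; here they are plain Python lists), that the patch at the same spatial location in the source image is the positive, and that all other source patches are negatives. The names `patch_contrastive_loss` and `l2_normalize`, and the temperature `tau=0.07`, are illustrative assumptions.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def patch_contrastive_loss(gen_feats, src_feats, tau=0.07):
    """Generic InfoNCE loss over patches (illustrative, hypothetical helper).

    gen_feats[i] and src_feats[i] are feature vectors for the patch at
    location i in the generated and source image. The positive for
    gen_feats[i] is src_feats[i]; every other source patch is a negative.
    tau is an assumed temperature, not a value taken from the paper.
    """
    if len(gen_feats) != len(src_feats) or not gen_feats:
        raise ValueError("need the same non-zero number of patches in both inputs")

    queries = [l2_normalize(v) for v in gen_feats]
    keys = [l2_normalize(v) for v in src_feats]

    total = 0.0
    for i, q in enumerate(queries):
        # Cosine similarity between this generated patch and every source patch.
        logits = [sum(a * b for a, b in zip(q, k)) / tau for k in keys]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the same-location source patch as the target.
        total += log_denom - logits[i]
    return total / len(queries)


# Toy features: three patches with orthogonal embeddings.
source = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
aligned = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.1], [0.1, 0.0, 0.8]]
shuffled = [aligned[1], aligned[2], aligned[0]]

loss_aligned = patch_contrastive_loss(aligned, source)
loss_shuffled = patch_contrastive_loss(shuffled, source)
```

In this toy setup the loss is small when each generated patch matches the source patch at the same location and large when the spatial correspondence is broken, which is the property that lets such a loss act as a content-preservation signal during sampling.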

Related Material


[bibtex]

@InProceedings{Yang_2023_ICCV,
    author    = {Yang, Serin and Hwang, Hyunmin and Ye, Jong Chul},
    title     = {Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {22873-22882}
}