@InProceedings{Fahim_2025_CVPR,
    author    = {Fahim, Masud An Nur Islam and Saqib, Nazmus and Boutellier, Jani},
    title     = {STAM: Zero-Shot Style Transfer using Diffusion Model via Attention Modulation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2025},
    pages     = {6332-6342}
}
STAM: Zero-Shot Style Transfer using Diffusion Model via Attention Modulation
Abstract
Diffusion models serve as the basis of several different zero-shot image editing applications, including image generation and style transfer. The basic approach in style transfer using diffusion models involves swapping attention components between the provided content and style images. Straightforward interchange of these components can lead to inadequate style injection or loss of content image characteristics. This paper addresses shortcomings of attention-guided style transfer by two novel contributions: a) preserving content via dual path attention aggregation and b) maintaining the impact of style through modulation of attention components. The proposed STAM approach can provide aesthetically appealing yet content-preserving style transfer through a combination of these contributions and is also applicable to prompt-driven style transfer. STAM is validated by extensive qualitative and quantitative evaluations and compared to ten recent works that are largely outperformed by the proposed work. In addition to style transfer quality, STAM is also compared to previous work in terms of inference time and remains close to the fastest competing approaches.
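The "basic approach" the abstract refers to, swapping attention components between the content and style images, can be illustrated with a minimal sketch. This is not the STAM method and not the authors' code: it assumes the common formulation in which queries come from the content image while keys and values come from the style image. The function names and the toy feature vectors are hypothetical, and real systems apply this inside the self-attention layers of a diffusion U-Net rather than on plain lists.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of feature vectors."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    dim = len(values[0])
    output = []
    for q in queries:
        # Similarity of this query token to every key token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Each output token is a weighted average of the value vectors.
        output.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
        )
    return output


def swapped_attention(content_q, style_k, style_v):
    """Assumed swap: content queries attend to style keys and values."""
    return attention(content_q, style_k, style_v)


# Toy example: three content tokens, two style tokens (hypothetical values).
content_q = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
style_k = [[1.0, 0.0], [0.0, 1.0]]
style_v = [[10.0, 0.0], [0.0, 10.0]]
stylized = swapped_attention(content_q, style_k, style_v)
```

Because every output token is a convex combination of style values, the spatial layout follows the content queries while the feature statistics come entirely from the style image. This makes it easy to see how a plain swap can lose content characteristics, which is the shortcoming the abstract says the paper addresses.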