ArtAdapter: Text-to-Image Style Transfer using Multi-Level Style Encoder and Explicit Adaptation

Dar-Yen Chen, Hamish Tennent, Ching-Wen Hsu; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 8619-8628

Abstract


This work introduces ArtAdapter, a transformative text-to-image (T2I) style transfer framework that transcends the traditional limitations of color, brushstrokes, and object shape, capturing high-level style elements such as composition and distinctive artistic expression. The integration of a multi-level style encoder with our proposed explicit adaptation mechanism enables ArtAdapter to achieve unprecedented fidelity in style transfer, ensuring close alignment with textual descriptions. Additionally, the incorporation of an Auxiliary Content Adapter (ACA) effectively separates content from style, alleviating the borrowing of content from style references. Moreover, our novel fast finetuning approach could further enhance zero-shot style representation while mitigating the risk of overfitting. Comprehensive evaluations confirm that ArtAdapter surpasses current state-of-the-art methods.

Related Material


@InProceedings{Chen_2024_CVPR,
    author    = {Chen, Dar-Yen and Tennent, Hamish and Hsu, Ching-Wen},
    title     = {ArtAdapter: Text-to-Image Style Transfer using Multi-Level Style Encoder and Explicit Adaptation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {8619-8628}
}