LIME: Localized Image Editing via Attention Regularization in Diffusion Models

Enis Simsar, Alessio Tonioni, Yongqin Xian, Thomas Hofmann, Federico Tombari; Proceedings of the Winter Conference on Applications of Computer Vision (WACV), 2025, pp. 222-231

Abstract


Diffusion models (DMs) have gained prominence due to their ability to generate high-quality, varied images, with recent advancements in text-to-image generation. The research focus is now shifting towards the controllability of DMs. A significant challenge within this domain is localized editing, where specific areas of an image are modified without affecting the rest of the content. This paper introduces LIME for localized image editing in diffusion models. LIME does not require user-specified regions of interest (RoI) or additional text input, but rather employs features from pre-trained methods and a straightforward clustering method to obtain a precise editing mask. Then, by leveraging cross-attention maps, it refines these segments to find the regions to edit. Finally, we propose a novel cross-attention regularization technique that penalizes unrelated cross-attention scores in the RoI during the denoising steps, ensuring localized edits. Our approach, without re-training, fine-tuning, or additional user inputs, consistently improves the performance of existing methods on various editing benchmarks. The project page can be found at https://enisimsar.github.io/LIME/.
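
The regularization step described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract only states that unrelated cross-attention scores inside the RoI are penalized, so the function name `regularize_cross_attention`, the additive `penalty` value, and the flat (pixels x tokens) score layout are all assumptions made for illustration.

```python
import numpy as np

def regularize_cross_attention(scores, roi_mask, related_tokens, penalty=10.0):
    """Hypothetical sketch: lower the pre-softmax scores of unrelated tokens in the RoI.

    scores:         (num_pixels, num_tokens) pre-softmax cross-attention scores
    roi_mask:       (num_pixels,) boolean mask, True inside the region of interest
    related_tokens: indices of the text tokens that describe the edit
    penalty:        assumed constant subtracted from unrelated scores in the RoI
    """
    scores = np.asarray(scores, dtype=float)
    roi_mask = np.asarray(roi_mask, dtype=bool)

    # Every token that is not listed as related counts as unrelated.
    unrelated = np.ones(scores.shape[1], dtype=bool)
    unrelated[list(related_tokens)] = False

    out = scores.copy()  # leave the caller's array untouched
    # Penalize only the (RoI pixel, unrelated token) entries.
    out[np.ix_(roi_mask, unrelated)] -= penalty
    return out

def softmax(x, axis=-1):
    """Numerically stable softmax over the token axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Toy usage: 4 pixels, 3 tokens, the first two pixels form the RoI,
# and token 1 is the only token related to the edit.
toy_scores = np.zeros((4, 3))
toy_roi = np.array([True, True, False, False])
toy_probs = softmax(regularize_cross_attention(toy_scores, toy_roi, [1]))
```

In this toy setup, pixels inside the RoI attend almost entirely to the related token after the softmax, while pixels outside the RoI keep their original attention distribution, which is the localization behavior the abstract describes.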

Related Material


[bibtex]
@InProceedings{Simsar_2025_WACV,
    author    = {Simsar, Enis and Tonioni, Alessio and Xian, Yongqin and Hofmann, Thomas and Tombari, Federico},
    title     = {LIME: Localized Image Editing via Attention Regularization in Diffusion Models},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {222-231}
}