TurboFill: Adapting Few-step Text-to-image Model for Fast Image Inpainting

Liangbin Xie, Daniil Pakhomov, Zhonghao Wang, Zongze Wu, Ziyan Chen, Yuqian Zhou, Haitian Zheng, Zhifei Zhang, Zhe Lin, Jiantao Zhou, Chao Dong; Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), 2025, pp. 7613-7622

Abstract


This paper introduces TurboFill, a fast image inpainting model that enhances a few-step text-to-image diffusion model with an inpainting adapter for high-quality and efficient inpainting. While standard diffusion models generate high-quality results, they incur high computational costs. We overcome this by training an inpainting adapter on a few-step distilled text-to-image model, DMD2, using a novel 3-step adversarial training scheme to ensure realistic, structurally consistent, and visually harmonious inpainted regions. To evaluate TurboFill, we propose two benchmarks: DilationBench, which tests performance across mask sizes, and HumanBench, based on human feedback for complex prompts. Experiments show that TurboFill outperforms both multi-step BrushNet and few-step inpainting methods, setting a new benchmark for high-performance inpainting tasks. The project page is available at https://liangbinxie.github.io/projects/TurboFill/.
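For readers unfamiliar with the adapter-on-a-frozen-backbone pattern the abstract describes, the sketch below illustrates one plausible wiring: a small trainable adapter encodes the masked image and mask and injects a residual into a frozen few-step denoiser. All names here (InpaintAdapter, FewStepInpainter, DummyDenoiser), the 9-channel conditioning layout, and the injection point are illustrative assumptions, not TurboFill's actual architecture or the DMD2 interface; the paper's 3-step adversarial training scheme is likewise not shown.

```python
import torch
import torch.nn as nn


class InpaintAdapter(nn.Module):
    """Tiny adapter that encodes (noisy latent, masked-image latent, mask) into a residual."""

    def __init__(self, in_ch: int = 9, feat_ch: int = 64, latent_ch: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, padding=1),
            nn.SiLU(),
            nn.Conv2d(feat_ch, latent_ch, 3, padding=1),
        )

    def forward(self, noisy_latent, masked_latent, mask):
        # Concatenate conditioning along channels: 4 + 4 + 1 = 9 channels.
        cond = torch.cat([noisy_latent, masked_latent, mask], dim=1)
        return self.net(cond)


class DummyDenoiser(nn.Module):
    """Placeholder for a distilled few-step denoiser (stand-in for a DMD2-style backbone)."""

    def __init__(self, latent_ch: int = 4):
        super().__init__()
        self.conv = nn.Conv2d(latent_ch, latent_ch, 3, padding=1)

    def forward(self, x, t):
        return self.conv(x)  # a real backbone would also condition on the timestep t


class FewStepInpainter(nn.Module):
    """Frozen few-step backbone plus trainable adapter, injected as a residual on the input latent."""

    def __init__(self, denoiser: nn.Module, adapter: InpaintAdapter):
        super().__init__()
        self.denoiser = denoiser
        for p in self.denoiser.parameters():
            p.requires_grad_(False)  # only the adapter receives gradients
        self.adapter = adapter

    def forward(self, noisy_latent, masked_latent, mask, t):
        residual = self.adapter(noisy_latent, masked_latent, mask)
        return self.denoiser(noisy_latent + residual, t)


if __name__ == "__main__":
    model = FewStepInpainter(DummyDenoiser(), InpaintAdapter())
    z = torch.randn(1, 4, 64, 64)         # noisy latent
    z_masked = torch.randn(1, 4, 64, 64)  # latent of the masked image
    mask = torch.ones(1, 1, 64, 64)       # binary hole mask (1 = region to fill)
    t = torch.tensor([999])
    print(model(z, z_masked, mask, t).shape)  # torch.Size([1, 4, 64, 64])
```

In the paper, training the adapter with an adversarial objective over the distilled model's few sampling steps is what enforces realism and harmony in the filled region; the sketch only shows the conditioning and freezing pattern.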

Related Material


[bibtex]
@InProceedings{Xie_2025_CVPR,
    author    = {Xie, Liangbin and Pakhomov, Daniil and Wang, Zhonghao and Wu, Zongze and Chen, Ziyan and Zhou, Yuqian and Zheng, Haitian and Zhang, Zhifei and Lin, Zhe and Zhou, Jiantao and Dong, Chao},
    title     = {TurboFill: Adapting Few-step Text-to-image Model for Fast Image Inpainting},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {7613-7622}
}