Brush2Prompt: Contextual Prompt Generator for Object Inpainting

Mang Tik Chiu, Yuqian Zhou, Lingzhi Zhang, Zhe Lin, Connelly Barnes, Sohrab Amirghodsi, Eli Shechtman, Humphrey Shi; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 12636-12645

Abstract


Object inpainting is a task that involves adding objects to real images and seamlessly compositing them. With the recent commercialization of products like Stable Diffusion and Generative Fill, inserting objects into images by using prompts has achieved impressive visual results. In this paper, we propose a prompt suggestion model to simplify the process of prompt input. When the user provides an image and a mask, our model predicts suitable prompts based on the partial contextual information in the masked image and the shape and location of the mask. Specifically, we introduce a concept-diffusion in the CLIP space that predicts CLIP-text embeddings from a masked image. These diffused embeddings can be directly injected into open-source inpainting models like Stable Diffusion and its variants. Alternatively, they can be decoded into natural language for use in other publicly available applications such as Generative Fill. Our prompt suggestion model demonstrates a balance of accuracy and diversity, showing its capability to be both contextually aware and creatively adaptive.
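
The abstract mentions that predicted embeddings can be decoded into natural language but does not say how. As an illustration only, the sketch below uses one simple stand-in: nearest-neighbor retrieval by cosine similarity over a small set of candidate prompts. This is an assumption for the sake of example, not the decoding method of the paper. The function names, the three-dimensional toy vectors, and the candidate prompts are all hypothetical; real CLIP text embeddings have hundreds of dimensions and would come from a CLIP text encoder.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def decode_embedding(predicted, vocabulary, top_k=3):
    """Return the top_k candidate prompts whose embeddings are closest to `predicted`.

    `vocabulary` maps a prompt string to its (hypothetical) text embedding.
    """
    scored = [
        (cosine_similarity(predicted, embedding), prompt)
        for prompt, embedding in vocabulary.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [prompt for _, prompt in scored[:top_k]]


# Toy stand-ins for text embeddings of candidate prompts (hypothetical values).
toy_vocabulary = {
    "a dog": [0.9, 0.1, 0.0],
    "a boat": [0.0, 0.2, 0.9],
    "a lamp": [0.1, 0.9, 0.1],
}

# Toy stand-in for an embedding predicted from a masked image (hypothetical value).
predicted_embedding = [0.8, 0.3, 0.1]

suggestions = decode_embedding(predicted_embedding, toy_vocabulary, top_k=2)
print(suggestions)
```

Returning several ranked candidates instead of a single best match loosely mirrors the idea of offering the user more than one prompt suggestion for the same mask.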

Related Material


[bibtex]
@InProceedings{Chiu_2024_CVPR,
    author    = {Chiu, Mang Tik and Zhou, Yuqian and Zhang, Lingzhi and Lin, Zhe and Barnes, Connelly and Amirghodsi, Sohrab and Shechtman, Eli and Shi, Humphrey},
    title     = {Brush2Prompt: Contextual Prompt Generator for Object Inpainting},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {12636-12645}
}