IMPRINT: Generative Object Compositing by Learning Identity-Preserving Representation

Song, Yizhi; Zhang, Zhifei; Lin, Zhe; Cohen, Scott; Price, Brian; Zhang, Jianming; Kim, Soo Ye; Zhang, He; Xiong, Wei; Aliaga, Daniel

Yizhi Song, Zhifei Zhang, Zhe Lin, Scott Cohen, Brian Price, Jianming Zhang, Soo Ye Kim, He Zhang, Wei Xiong, Daniel Aliaga; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 8048-8058

Abstract

Generative object compositing emerges as a promising new avenue for compositional image editing. However, the requirement of object identity preservation poses a significant challenge, limiting practical usage of most existing methods. In response, this paper introduces IMPRINT, a novel diffusion-based generative model trained with a two-stage learning framework that decouples learning of identity preservation from that of compositing. The first stage is targeted for context-agnostic, identity-preserving pretraining of the object encoder, enabling the encoder to learn an embedding that is both view-invariant and conducive to enhanced detail preservation. The subsequent stage leverages this representation to learn seamless harmonization of the object composited to the background. In addition, IMPRINT incorporates a shape-guidance mechanism offering user-directed control over the compositing process. Extensive experiments demonstrate that IMPRINT significantly outperforms existing methods and various baselines on identity preservation and composition quality.

Related Material


@InProceedings{Song_2024_CVPR,
    author    = {Song, Yizhi and Zhang, Zhifei and Lin, Zhe and Cohen, Scott and Price, Brian and Zhang, Jianming and Kim, Soo Ye and Zhang, He and Xiong, Wei and Aliaga, Daniel},
    title     = {IMPRINT: Generative Object Compositing by Learning Identity-Preserving Representation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {8048-8058}
}