Foreground Object Search by Distilling Composite Image Feature

Bo Zhang, Jiacheng Sui, Li Niu; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 22986-22995

Abstract


Foreground object search (FOS) aims to find compatible foreground objects for a given background image, producing a realistic composite image. We observe that competitive retrieval performance can be achieved by using a discriminator to predict the compatibility of a composite image, but this approach has an unaffordable time cost. To address this, we propose a novel FOS method via distilling composite feature (DiscoFOS). Specifically, the abovementioned discriminator serves as the teacher network. The student network employs two encoders to extract the foreground feature and the background feature. Their interaction output is enforced to match the composite image feature from the teacher network. Additionally, previous works did not release their datasets, so we contribute two datasets for the FOS task: the S-FOSD dataset with synthetic composite images and the R-FOSD dataset with real composite images. Extensive experiments on our two datasets demonstrate the superiority of the proposed method over previous approaches. The datasets and code are available at https://github.com/bcmi/Foreground-Object-Search-Dataset-FOSD.
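The teacher-student setup described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the form of the interaction, the distillation loss, or the scoring head, so everything below is an assumption made for illustration: the element-wise product in `interact`, the mean squared error in `distillation_loss`, and the linear scoring in `rank_foregrounds` are hypothetical stand-ins, not the authors' implementation. Features are plain Python lists rather than encoder outputs.

```python
from typing import List, Sequence

Vector = Sequence[float]


def interact(fg: Vector, bg: Vector) -> List[float]:
    """Combine a foreground and a background feature.

    Assumption: an element-wise product stands in for the student's
    interaction module, whose actual form the abstract does not give.
    """
    if len(fg) != len(bg):
        raise ValueError("feature dimensions must match")
    return [f * b for f, b in zip(fg, bg)]


def distillation_loss(student_out: Vector, teacher_feat: Vector) -> float:
    """Mean squared error between the student's interaction output and the
    teacher's composite image feature (assumed loss, for illustration)."""
    if len(student_out) != len(teacher_feat):
        raise ValueError("feature dimensions must match")
    squared = [(s - t) ** 2 for s, t in zip(student_out, teacher_feat)]
    return sum(squared) / len(squared)


def rank_foregrounds(bg: Vector, fg_feats: Sequence[Vector],
                     weights: Vector) -> List[int]:
    """Return foreground indices sorted from most to least compatible.

    Assumption: compatibility is a linear score over the interaction output.
    """
    scores = [
        sum(w * x for w, x in zip(weights, interact(fg, bg)))
        for fg in fg_feats
    ]
    return sorted(range(len(fg_feats)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    background = [1.0, 1.0]
    foregrounds = [[0.1, 0.2], [0.9, 0.8], [0.5, 0.5]]
    print(rank_foregrounds(background, foregrounds, weights=[1.0, 1.0]))
```

In this sketch, ranking needs only the two per-image features and a cheap interaction, whereas the teacher-style approach would have to build and score a composite image for every candidate foreground.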

Related Material


[bibtex]
@InProceedings{Zhang_2023_ICCV,
    author    = {Zhang, Bo and Sui, Jiacheng and Niu, Li},
    title     = {Foreground Object Search by Distilling Composite Image Feature},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {22986-22995}
}