ProS: Prompting-to-simulate Generalized knowledge for Universal Cross-Domain Retrieval

Kaipeng Fang, Jingkuan Song, Lianli Gao, Pengpeng Zeng, Zhi-Qi Cheng, Xiyao Li, Heng Tao Shen; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 17292-17301


The goal of Universal Cross-Domain Retrieval (UCDR) is to achieve robust performance in generalized test scenarios, in which data may belong to domains and categories that are strictly unseen during training. Recently, pre-trained models with prompt tuning have shown strong generalization capabilities and attained noteworthy results in various downstream tasks, such as few-shot learning and video-text retrieval. However, applying them directly to UCDR may not be sufficient to handle both domain shift (i.e., adapting to unfamiliar domains) and semantic shift (i.e., transferring to unknown categories). To this end, we propose Prompting-to-Simulate (ProS), the first method to apply prompt tuning to UCDR. ProS employs a two-step process to simulate Content-aware Dynamic Prompts (CaDP), which guide models to produce generalized features for UCDR. Concretely, in the Prompt Units Learning stage, we introduce two Prompt Units that individually capture domain and semantic knowledge in a mask-and-align way. Then, in the Context-aware Simulator Learning stage, we train a Content-aware Prompt Simulator under a simulated test scenario to produce the corresponding CaDP. Extensive experiments on three benchmark datasets show that our method achieves new state-of-the-art performance without introducing excessive parameters. Code is available at
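
To make the retrieval setting concrete, the following is a minimal sketch of how a query from one domain might be matched against a gallery from another once features have been extracted. It is not the ProS method: the abstract does not specify the similarity measure or the evaluation metric, so cosine similarity and precision@k are assumptions here, and the feature vectors, labels, and function names are hypothetical toy values standing in for the outputs of a prompted model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Guard against zero vectors, which have no defined direction.
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_gallery(query, gallery):
    """Return gallery indices sorted by descending similarity to the query."""
    scores = [cosine(query, g) for g in gallery]
    return sorted(range(len(gallery)), key=lambda i: scores[i], reverse=True)


def precision_at_k(ranked, gallery_labels, query_label, k):
    """Fraction of the top-k retrieved items that share the query's category."""
    hits = sum(1 for i in ranked[:k] if gallery_labels[i] == query_label)
    return hits / k


# Hypothetical toy features: one query (e.g. a sketch) and a four-image gallery.
query_feature = [0.9, 0.1, 0.0]
query_label = "cat"
gallery_features = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.8, 0.2, 0.1],
    [0.0, 0.1, 1.0],
]
gallery_labels = ["cat", "dog", "cat", "bird"]

ranked = rank_gallery(query_feature, gallery_features)
p_at_2 = precision_at_k(ranked, gallery_labels, query_label, k=2)
print(ranked, p_at_2)
```

In the UCDR setting described above, both the query domain and the query category would be absent from training, so the quality of such a ranking depends entirely on how well the learned features generalize.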

Related Material

@InProceedings{Fang_2024_CVPR,
    author    = {Fang, Kaipeng and Song, Jingkuan and Gao, Lianli and Zeng, Pengpeng and Cheng, Zhi-Qi and Li, Xiyao and Shen, Heng Tao},
    title     = {ProS: Prompting-to-simulate Generalized knowledge for Universal Cross-Domain Retrieval},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {17292-17301}
}