@InProceedings{Zhao_2025_WACV,
  author    = {Zhao, Jingjiao and Li, Jiaju and Lian, Dongze and Sun, Liguo and Lv, Pin},
  title     = {DualCIR: Enhancing Training-Free Composed Image Retrieval via Dual-Directional Descriptions},
  booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
  month     = {February},
  year      = {2025},
  pages     = {5926-5936}
}
DualCIR: Enhancing Training-Free Composed Image Retrieval via Dual-Directional Descriptions
Abstract
Integrating language and images for Composed Image Retrieval (CIR) allows for a flexible representation of search intent, making it a focal point in multi-modal research. Traditional CIR methods train models on complex triplet datasets. In contrast, Zero-Shot Composed Image Retrieval (ZS-CIR) eliminates the need for constructing datasets and training models for each specific task, attracting significant attention from researchers. CIReVL is a training-free method that translates visual content into textual representations, employing large language models (LLMs) to perform robust text inference and visual language models (VLMs) to ensure multi-modal alignment during retrieval. Although this approach achieves high performance without training and offers strong interpretability, relying solely on a single descriptive text for retrieval limits its ability to handle complex search demands. To address this issue, we propose the DualCIR framework, which uses an LLM to generate refined dual-directional descriptions--both positive and negative--alongside holistic target descriptions. We score retrieval candidates separately with these descriptions, merge the scores, rank the candidates, and thereby perform retrieval. Extensive experiments across three datasets in the natural-image and fashion domains show that our approach maintains low inference costs while noticeably enhancing retrieval effectiveness. Compared with existing training-free methods, DualCIR achieves state-of-the-art performance and is even competitive with training-based methods.
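The scoring-and-merging step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes CLIP-style L2-normalized embeddings (so dot products are cosine similarities), and the merging weights `w_pos`/`w_neg` and the simple additive/subtractive combination are hypothetical choices for illustration.

```python
import numpy as np

def rank_candidates(image_embs, target_emb, pos_emb, neg_emb,
                    w_pos=0.5, w_neg=0.5):
    """Score each candidate image against the holistic target description,
    add the positive-direction score, subtract the negative-direction score,
    merge, and rank. All embeddings are assumed L2-normalized, so matrix
    products below are cosine similarities. Weights are illustrative only.
    """
    s_target = image_embs @ target_emb   # similarity to holistic description
    s_pos = image_embs @ pos_emb         # similarity to positive description
    s_neg = image_embs @ neg_emb         # similarity to negative description
    merged = s_target + w_pos * s_pos - w_neg * s_neg
    return np.argsort(-merged)           # best-first candidate indices

def unit(v):
    """L2-normalize a vector."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

# Toy usage with 2-D stand-in embeddings: candidate 0 matches the target,
# candidate 1 matches the (penalized) negative description.
imgs = np.stack([unit([1, 0]), unit([0, 1]), unit([1, 1])])
order = rank_candidates(imgs, unit([1, 0]), unit([1, 1]), unit([0, 1]))
# order is [0, 2, 1]: the negative description demotes candidate 1
```

In practice the three description embeddings would come from encoding LLM-generated texts with a VLM text encoder, and `image_embs` from the VLM image encoder over the retrieval gallery.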