Negation Matters: Training-Free Negation-Aware Image Retrieval

Aashish Pokhrel, Shivanand Venkanna Sheshappanavar; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026, pp. 11353-11362

Abstract


Negation, the linguistic ability to assert the absence of a concept, is a critical bottleneck in vision-language understanding. Vision-language models have demonstrated remarkable success in aligning visual and textual representations across domains such as content moderation, medical image retrieval, and natural-language-guided search. Yet these models consistently fail to handle negation robustly, often retrieving images that contain the very concept that was explicitly excluded. Existing methods address this limitation through fine-tuning on synthetic negation corpora, an approach that is computationally expensive, dataset-dependent, and prone to compromising generalization to unseen distributions. We propose SpaceVLM-DRC, which introduces Dynamic Repulsion with Context Anchoring (DRC) into the SpaceVLM framework as a training-free, inference-time negation resolution mechanism over a frozen CLIP backbone. The framework first decomposes queries into affirmative, negated, and counterfactual components, then applies dynamic repulsion to push negated concepts away in the embedding space, and finally anchors retrieval within the full-caption context to preserve semantic coherence. SpaceVLM-DRC surpasses state-of-the-art results on the MSRVTT negation retrieval benchmark and achieves performance comparable to fully fine-tuned approaches on the COCO negated retrieval dataset. Crucially, it requires no model retraining while preserving zero-shot generalization on non-negated queries. Our code is available at https://github.com/aashishpokhrel27/spacevlm-drc.
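
To make the three stages described in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes the query has already been split into an affirmative part and a negated part and that all embeddings come from a frozen encoder; the counterfactual component is omitted. The names `normalize` and `negation_aware_scores`, the weights `alpha` and `beta`, and the squared clipped penalty are all hypothetical choices made for illustration. The abstract does not specify the actual scoring rule.

```python
import numpy as np

def normalize(x):
    """Scale vectors to unit length along the last axis."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def negation_aware_scores(image_embs, affirm_emb, negated_emb, caption_emb,
                          alpha=1.0, beta=0.5):
    """Toy negation-aware retrieval score (illustrative assumption only).

    image_embs : (N, D) image embeddings from a frozen encoder
    affirm_emb : (D,) embedding of the affirmative part of the query
    negated_emb: (D,) embedding of the negated concept
    caption_emb: (D,) embedding of the full caption (context anchor)
    alpha, beta: hypothetical weights for repulsion and anchoring
    """
    imgs = normalize(image_embs)
    s_aff = imgs @ normalize(affirm_emb)    # match to what is wanted
    s_neg = imgs @ normalize(negated_emb)   # match to what is excluded
    s_ctx = imgs @ normalize(caption_emb)   # match to the whole caption

    # "Dynamic" in this sketch means the penalty grows with how strongly
    # an image matches the negated concept; images with no positive
    # similarity to it are not penalized at all.
    penalty = alpha * np.maximum(s_neg, 0.0) ** 2

    return s_aff - penalty + beta * s_ctx

# Toy 3-D example: axis 0 stands for "dog", axis 1 for "cat".
images = np.array([[1.0, 0.0, 0.0],    # image with a dog only
                   [1.0, 1.0, 0.0]])   # image with a dog and a cat
scores = negation_aware_scores(images,
                               affirm_emb=[1, 0, 0],    # "a dog"
                               negated_emb=[0, 1, 0],   # "cat" (excluded)
                               caption_emb=[1, 1, 0])   # "a dog but no cat"
ranking = np.argsort(-scores)  # image indices, best match first
```

In this toy setup the full-caption embedding alone sits closest to the dog-and-cat image, which mirrors the failure mode the abstract describes; subtracting the repulsion term moves the dog-only image to the top of the ranking.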

Related Material


[bibtex]
@InProceedings{Pokhrel_2026_CVPR,
    author    = {Pokhrel, Aashish and Sheshappanavar, Shivanand Venkanna},
    title     = {Negation Matters: Training-Free Negation-Aware Image Retrieval},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2026},
    pages     = {11353-11362}
}