CLIPN for Zero-Shot OOD Detection: Teaching CLIP to Say No

Hualiang Wang, Yi Li, Huifeng Yao, Xiaomeng Li; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 1802-1812

Abstract


Out-of-distribution (OOD) detection refers to training the model on in-distribution (ID) dataset to classify if the input images come from unknown classes. Considerable efforts have been invested in designing various OOD detection methods based on either convolutional neural networks or transformers. However, Zero-shot OOD detection methods driven by CLIP, which require only class names for ID, have received less attention. This paper presents a novel method, namely CLIP saying no (CLIPN), which empowers "no" logic within CLIP. Our key motivation is to equip CLIP with the capability of distinguishing OOD and ID samples via positive-semantic prompts and negation-semantic prompts. To be specific, we design a novel learnable "no" prompt and a "no" text encoder to capture the negation-semantic with images. Subsequently, we introduce two loss functions: the image-text binary-opposite loss and the text semantic-opposite loss, which we use to teach CLIPN to associate images with "no" prompts, thereby enabling it to identify unknown samples. Furthermore, we propose two threshold-free inference algorithms to perform OOD detection via using negation semantics from "no" prompts and text encoder. Experimental results on 9 benchmark datasets (3 ID datasets and 6 OOD datasets) for the OOD detection task demonstrate that CLIPN outperforms 7 well-used algorithms by at least 1.1% and 7.37% on AUROC and FPR95 on zero-shot OOD detection of ImageNet-1K. Our CLIPN can serve as a solid foundation for leveraging CLIP effectively in downstream OOD tasks.

Related Material


[pdf] [arXiv]
[bibtex]
@InProceedings{Wang_2023_ICCV, author = {Wang, Hualiang and Li, Yi and Yao, Huifeng and Li, Xiaomeng}, title = {CLIPN for Zero-Shot OOD Detection: Teaching CLIP to Say No}, booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)}, month = {October}, year = {2023}, pages = {1802-1812} }