Explainability for Content-Based Image Retrieval

Bo Dong, Roddy Collins, Anthony Hoogs; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2019, pp. 95-98


We discuss how the concept of "explainability" may be applied to Content-Based Image Retrieval (CBIR) systems. CBIR typically transforms an image into a feature representation over which a similarity distance metric can be computed; recent systems have improved performance by using features from deep learning networks [11, 6, 3]. However, because these representations have no direct semantic interpretability, the system's behavior can be difficult for the user to understand: semantically significant objects in the scene may have little presence in the feature representation, while, conversely, the similarity metric for two images may be dominated by pixel content that is not their semantic focus, such as the background. We propose Similarity Based Saliency Maps (SBSM) to illustrate which areas of an image the CBIR system uses when retrieving and ranking results; the SBSM thus serves to "explain" the CBIR system's decisions to the user. We have implemented SBSMs in our open-source Social Media Query Toolkit (SMQTK) [4], and preliminary user studies demonstrate that SBSMs allow users to retrieve images more efficiently.
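To make the idea concrete, a similarity-based saliency map can be sketched as an occlusion sweep: mask patches of a retrieved image, re-extract features, and record how much the query-to-result distance changes; regions whose occlusion moves the distance the most are the ones the retrieval relied on. This is a minimal sketch, not the SMQTK implementation: `extract_feature` is a placeholder (per-channel mean) standing in for the system's deep feature extractor, and the patch size, stride, and Euclidean distance are illustrative choices.

```python
import numpy as np

def extract_feature(image):
    # Placeholder feature extractor (assumption): per-channel mean intensity.
    # A real CBIR system would use a deep network embedding here.
    return image.mean(axis=(0, 1))

def sbsm(query_img, retrieved_img, patch=8, stride=4):
    """Occlusion-style similarity-based saliency map (illustrative sketch).

    Slides an occluding patch over the retrieved image and accumulates the
    absolute change in query-to-result feature distance; pixels covered by
    patches that change the distance a lot receive high saliency.
    """
    q_feat = extract_feature(query_img)
    base_dist = np.linalg.norm(q_feat - extract_feature(retrieved_img))

    h, w = retrieved_img.shape[:2]
    saliency = np.zeros((h, w))
    counts = np.zeros((h, w))

    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            occluded = retrieved_img.copy()
            occluded[y:y + patch, x:x + patch] = 0  # mask this patch
            d = np.linalg.norm(q_feat - extract_feature(occluded))
            # Distribute the distance change over the occluded pixels.
            saliency[y:y + patch, x:x + patch] += abs(d - base_dist)
            counts[y:y + patch, x:x + patch] += 1

    saliency /= np.maximum(counts, 1)  # average over overlapping patches
    if saliency.max() > 0:
        saliency /= saliency.max()     # normalize to [0, 1] for display
    return saliency
```

The resulting map can be overlaid on the retrieved image as a heatmap, highlighting the regions that drove the ranking rather than the regions a human would consider semantically central.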

Related Material

@InProceedings{Dong_2019_CVPR_Workshops,
    author = {Dong, Bo and Collins, Roddy and Hoogs, Anthony},
    title = {Explainability for Content-Based Image Retrieval},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month = {June},
    year = {2019}
}