Grounding Counterfactual Explanation of Image Classifiers to Textual Concept Space

Kim, Siwon; Oh, Jinoh; Lee, Sungjin; Yu, Seunghak; Do, Jaeyoung; Taghavi, Tara

Siwon Kim, Jinoh Oh, Sungjin Lee, Seunghak Yu, Jaeyoung Do, Tara Taghavi; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 10942-10950

Abstract

Concept-based explanation aims to provide concise and human-understandable explanations of an image classifier. However, existing concept-based explanation methods typically require a significant amount of manually collected concept-annotated images. This is costly and risks introducing human biases into the explanation. In this paper, we propose counterfactual explanation with text-driven concepts (CounTEX), where the concepts are defined only from text by leveraging a pre-trained multi-modal joint embedding space, without additional concept-annotated datasets. A conceptual counterfactual explanation is generated with these text-driven concepts. To use the text-driven concepts defined in the joint embedding space to interpret the target classifier's outcome, we present a novel projection scheme for mapping between the two spaces with a simple yet effective implementation. We show that CounTEX generates faithful explanations that provide a semantic understanding of the model's decision rationale and are robust to human bias.
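The idea summarized in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the projection scheme, so the least-squares linear maps, the anchor-based concept directions, the linear classifier head, and every name below (`fit_linear_map`, `to_joint`, `to_clf`, `concept_dirs`, `head`) are assumptions made for illustration, with random vectors standing in for real image and text embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (all hypothetical). In practice these would come from the
# target classifier's encoder and a pre-trained image-text embedding model.
n_pairs, d_clf, d_joint, n_concepts, n_classes = 200, 16, 12, 5, 3
clf_feats = rng.normal(size=(n_pairs, d_clf))
joint_feats = clf_feats @ rng.normal(size=(d_clf, d_joint))
joint_feats += 0.01 * rng.normal(size=joint_feats.shape)

def fit_linear_map(src, dst):
    """Least-squares matrix M such that src @ M approximates dst."""
    M, *_ = np.linalg.lstsq(src, dst, rcond=None)
    return M

# Assumption: the two spaces are related by linear maps fitted on paired features.
to_joint = fit_linear_map(clf_feats, joint_feats)  # classifier space -> joint space
to_clf = fit_linear_map(joint_feats, clf_feats)    # joint space -> classifier space

# Assumption: a concept direction is a concept's text embedding minus an
# anchor text embedding, normalized to unit length.
concept_text = rng.normal(size=(n_concepts, d_joint))
anchor_text = rng.normal(size=d_joint)
concept_dirs = concept_text - anchor_text
concept_dirs /= np.linalg.norm(concept_dirs, axis=1, keepdims=True)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

head = rng.normal(size=(d_clf, n_classes))  # toy linear classifier head
f = rng.normal(size=d_clf)                  # latent feature of one input image
target = 2                                  # counterfactual target class

# Effect of each concept direction on the logits after projecting back.
A = concept_dirs @ to_clf @ head
base_logits = (f @ to_joint) @ to_clf @ head

def target_prob(w):
    """Target-class probability after shifting along the concept directions."""
    return softmax(base_logits + w @ A)[target]

# Gradient descent on cross-entropy toward the target class; the learned
# weights w act as per-concept importance scores in this toy setting.
w = np.zeros(n_concepts)
step = 1.0 / np.linalg.norm(A, 2) ** 2
onehot = np.eye(n_classes)[target]
for _ in range(200):
    p = softmax(base_logits + w @ A)
    w -= step * (A @ (p - onehot))

p_before = target_prob(np.zeros(n_concepts))
p_after = target_prob(w)
print("concept weights:", np.round(w, 3))
```

In this sketch a positive weight means that adding the concept pushes the prediction toward the target class, and a negative weight means that removing it does. Real embeddings and a nonlinear classifier would need a more careful optimization than this convex toy case.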

Related Material

@InProceedings{Kim_2023_CVPR,
    author    = {Kim, Siwon and Oh, Jinoh and Lee, Sungjin and Yu, Seunghak and Do, Jaeyoung and Taghavi, Tara},
    title     = {Grounding Counterfactual Explanation of Image Classifiers to Textual Concept Space},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {10942-10950}
}