DiG-IN: Diffusion Guidance for Investigating Networks - Uncovering Classifier Differences, Neuron Visualisations and Visual Counterfactual Explanations

Maximilian Augustin, Yannic Neuhaus, Matthias Hein; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 11093-11103

Abstract


While deep learning has led to huge progress in complex image classification tasks like ImageNet, unexpected failure modes, e.g. via spurious features, call into question how reliably these classifiers work in the wild. Furthermore, for safety-critical tasks the black-box nature of their decisions is problematic, and explanations, or at least methods that make decisions plausible, are urgently needed. In this paper we address these problems by generating images that optimize a classifier-derived objective using a framework for guided image generation. We analyze the decisions of image classifiers by visual counterfactual explanations (VCEs), detection of systematic mistakes by analyzing images where classifiers maximally disagree, and visualization of neurons and spurious features. In this way we validate existing observations, e.g. the shape bias of adversarially robust models, as well as novel failure modes, e.g. systematic errors of zero-shot CLIP classifiers. Moreover, our VCEs outperform previous work while being more versatile.
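To make the core idea concrete, below is a minimal sketch (not the authors' implementation) of classifier-guided image optimization: a frozen classifier supplies the objective, and the latent of a differentiable generator is optimized by gradient descent so that the generated image maximizes the target-class probability. The generator here is a trivial stand-in for self-containment; in DiG-IN it would be a differentiable diffusion sampler, and the target class, latent shape, and hyperparameters are illustrative assumptions.

```python
# Illustrative sketch of classifier-guided image generation (assumptions noted below).
import torch
import torch.nn.functional as F
import torchvision

# Frozen classifier that defines the objective (any differentiable classifier works).
classifier = torchvision.models.resnet50(weights="DEFAULT").eval()
for p in classifier.parameters():
    p.requires_grad_(False)

# ImageNet normalization constants for the classifier input.
MEAN = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
STD = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)

def generate(latent):
    # Stand-in for a differentiable (diffusion) image generator:
    # maps a low-resolution latent to a 224x224 RGB image in [0, 1].
    img = F.interpolate(latent, size=(224, 224), mode="bilinear", align_corners=False)
    return torch.sigmoid(img)

target_class = 207  # hypothetical target class index
latent = torch.randn(1, 3, 64, 64, requires_grad=True)
optimizer = torch.optim.Adam([latent], lr=0.05)

for step in range(100):
    optimizer.zero_grad()
    image = generate(latent)
    logits = classifier((image - MEAN) / STD)
    # Classifier-derived objective: maximize the log-probability of the target class.
    loss = -F.log_softmax(logits, dim=1)[0, target_class]
    loss.backward()
    optimizer.step()
```

The same loop structure carries over to the paper's other use cases by swapping the objective, e.g. maximizing the disagreement between two classifiers or the activation of a single neuron instead of a target-class probability.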

Related Material


[pdf] [supp]
[bibtex]
@InProceedings{Augustin_2024_CVPR,
    author    = {Augustin, Maximilian and Neuhaus, Yannic and Hein, Matthias},
    title     = {DiG-IN: Diffusion Guidance for Investigating Networks - Uncovering Classifier Differences, Neuron Visualisations and Visual Counterfactual Explanations},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {11093-11103}
}