@InProceedings{Kim_2025_CVPR,
    author    = {Kim, Won Jun and Chung, Hyungjin and Kim, Jaemin and Lee, Sangmin and Sim, Byeongsu and Ye, Jong Chul},
    title     = {Derivative-Free Diffusion Manifold-Constrained Gradient for Unified XAI},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {23795-23805}
}
Derivative-Free Diffusion Manifold-Constrained Gradient for Unified XAI
Abstract
Gradient-based methods are a prototypical family of "explainability for AI" (XAI) techniques, especially for image-based models. However, they (1) require white-box access to models, (2) are vulnerable to adversarial attacks, and (3) produce attributions that lie off the image manifold, leading to explanations that are not amenable to human perception. To overcome these challenges, we introduce Derivative-Free Diffusion Manifold-Constrained Gradients (FreeMCG): by leveraging ensemble Kalman filters and diffusion models, we derive a derivative-free approximation of the model's gradient projected onto the data manifold, requiring access only to the model's outputs (i.e., the black-box setting). We demonstrate the effectiveness of FreeMCG by applying it to both counterfactual generation and feature attribution, which have traditionally been treated as different tasks requiring distinct methods. Through comprehensive evaluation on both counterfactual explanation and feature attribution, we show that our method yields state-of-the-art results for both tasks while preserving the essential properties expected of XAI tools. Code: https://github.com/one-june/FreeMCG.
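As a rough illustration of the derivative-free idea mentioned in the abstract, the sketch below estimates a gradient from function outputs alone, using the cross-covariance between an ensemble of perturbed inputs and the corresponding outputs (the statistical linearization that underlies ensemble Kalman methods). It is a minimal sketch under stated assumptions, not the FreeMCG algorithm: it omits the diffusion model and the projection onto the data manifold entirely, and the names `ensemble_gradient` and `toy_model`, the Gaussian perturbation scheme, and all parameter values are hypothetical choices made for this example.

```python
import random


def ensemble_gradient(f, x, num_particles=2000, sigma=0.1, seed=0):
    """Estimate the gradient of a black-box scalar function f at x.

    Uses the cross-covariance between Gaussian input perturbations and
    the function outputs, rescaled by the perturbation variance.
    Only evaluations of f are needed, never its derivatives.
    """
    rng = random.Random(seed)
    dim = len(x)

    # Draw an ensemble of particles around x (assumed isotropic Gaussian).
    particles = [
        [xi + rng.gauss(0.0, sigma) for xi in x] for _ in range(num_particles)
    ]
    outputs = [f(p) for p in particles]

    # Ensemble means of the inputs and of the outputs.
    mean_x = [sum(p[d] for p in particles) / num_particles for d in range(dim)]
    mean_f = sum(outputs) / num_particles

    # Cross-covariance C_xf, divided by sigma^2 to undo the input scaling.
    grad = []
    for d in range(dim):
        c = sum(
            (p[d] - mean_x[d]) * (o - mean_f) for p, o in zip(particles, outputs)
        )
        grad.append(c / (num_particles * sigma ** 2))
    return grad


def toy_model(x):
    # Hypothetical black-box score; its true gradient is [2.0, -1.0, 0.5].
    return 2.0 * x[0] - 1.0 * x[1] + 0.5 * x[2]


estimate = ensemble_gradient(toy_model, [0.3, -0.2, 0.1])
print(estimate)
```

For this linear toy function the estimate should point in roughly the same direction as the true gradient, with sampling noise that shrinks as `num_particles` grows. In the black-box setting described above, `f` would be a classifier's output score queried on perturbed images rather than an analytic function.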