A Unified, Resilient, and Explainable Adversarial Patch Detector
Abstract
Deep Neural Networks (DNNs), the backbone architecture of almost every computer vision task, are vulnerable to adversarial attacks, particularly physical out-of-distribution (OOD) adversarial patches. Existing defense models often struggle to interpret these attacks in ways that align with human visual perception. Our proposed AdvPatchXAI approach introduces a generalized, robust, and explainable defense algorithm designed to defend DNNs against physical adversarial threats. AdvPatchXAI employs a novel patch decorrelation loss that reduces feature redundancy and enhances the distinctiveness of patch representations, enabling better generalization across unseen adversarial scenarios. It learns prototypical parts in a self-supervised manner, enhancing interpretability and alignment with human vision. The model uses a sparse linear layer for classification, making the decision process globally interpretable through a set of learned prototypes and locally explainable by pinpointing the relevant prototypes within an image. Our comprehensive evaluation shows that AdvPatchXAI closes the "semantic" gap between latent space and pixel space and effectively handles unseen adversarial patches, even when perturbed with unseen corruptions, thereby significantly advancing DNN robustness in practical settings (code: https://github.com/tbvl22/Unified-resilient-and-Explainable-Adversarial-Patch-detector).
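The abstract names two mechanisms: a patch decorrelation loss that reduces redundancy between prototype features, and a sparse linear classification layer over learned prototypes. As a rough illustration only (the authors' exact formulation is in the linked repository), the PyTorch sketch below implements one plausible reading: the loss penalizes off-diagonal entries of the batch correlation matrix of prototype activations, and the head carries an L1 penalty so each class depends on only a few prototypes. All identifiers (`decorrelation_loss`, `SparsePrototypeHead`) and the penalty weights are hypothetical.

```python
# Hypothetical sketch of the two mechanisms described in the abstract;
# names and formulation are illustrative assumptions, not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

def decorrelation_loss(z: torch.Tensor) -> torch.Tensor:
    """Penalize off-diagonal correlations between prototype activation channels.

    z: (batch, num_prototypes) activations. Lower cross-channel correlation
    means less redundant, more distinctive prototype features.
    """
    z = (z - z.mean(dim=0)) / (z.std(dim=0) + 1e-6)  # standardize each channel
    corr = (z.T @ z) / z.shape[0]                    # empirical correlation matrix
    off_diag = corr - torch.diag(torch.diagonal(corr))
    return (off_diag ** 2).sum() / z.shape[1]

class SparsePrototypeHead(nn.Module):
    """Linear classifier over prototype activations; an L1 penalty on its
    weights keeps class-prototype connections sparse, so each decision can
    be traced to a small set of prototypes."""
    def __init__(self, num_prototypes: int, num_classes: int):
        super().__init__()
        self.linear = nn.Linear(num_prototypes, num_classes, bias=False)

    def forward(self, proto_scores: torch.Tensor) -> torch.Tensor:
        return self.linear(proto_scores)

    def sparsity_penalty(self) -> torch.Tensor:
        return self.linear.weight.abs().mean()

# Hypothetical training objective combining the three terms
# (weights 0.1 and 1e-3 are placeholders, not values from the paper):
head = SparsePrototypeHead(num_prototypes=128, num_classes=2)
feats = torch.randn(32, 128)            # stand-in prototype activations
labels = torch.randint(0, 2, (32,))     # clean vs. adversarial-patch labels
logits = head(feats)
loss = (F.cross_entropy(logits, labels)
        + 0.1 * decorrelation_loss(feats)
        + 1e-3 * head.sparsity_penalty())
```

Under this reading, minimizing the decorrelation term pushes prototype channels toward distinct directions in feature space, while the L1 term zeroes out most class-prototype weights, which is what would make the classifier globally interpretable as a short list of prototypes per class.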
Related Material

[bibtex]
@InProceedings{Kumar_2025_CVPR,
    author    = {Kumar, Vishesh and Agarwal, Akshay},
    title     = {A Unified, Resilient, and Explainable Adversarial Patch Detector},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {30387-30397}
}