RL-CAM: Visual Explanations for Convolutional Networks Using Reinforcement Learning
Convolutional Neural Networks (CNNs) are state-of-the-art models for computer vision tasks such as image classification, object detection, and segmentation. However, these models offer little insight into how they reach their decisions, a serious limitation in fields such as healthcare and security, where interpretability is critical. Prior work has developed various methods for interpreting CNNs, including visualization-based approaches (e.g., saliency maps) that aim to reveal the features the model uses to make predictions. In this work, we propose a novel approach that uses reinforcement learning to generate visual explanations for CNNs. Our method treats the CNN as a black box and relies solely on the probability distribution over the model's outputs to localize the features contributing to a particular prediction. The proposed reinforcement learning agent has two actions: a forward action that explores the input image and identifies the most sensitive region to generate a localization mask, and a reverse action that fine-tunes this mask. We evaluate our approach using multiple image segmentation metrics and compare it with existing visualization-based methods. The experimental results demonstrate that the proposed method outperforms existing techniques, producing more accurate localization masks of the regions of interest in the input images.
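The forward/reverse procedure described above can be illustrated with a minimal sketch. This is not the paper's algorithm: the black-box model, the grid-cell action space, and the greedy occlusion-based reward below are all simplified stand-ins, assumed only for illustration of the idea of probing a model's output probability to build and then prune a localization mask.

```python
import numpy as np

def black_box_predict(image):
    # Hypothetical stand-in for the black-box CNN: returns a target-class
    # probability that depends on brightness in the top-left quadrant.
    return float(image[:4, :4].mean())

def sensitivity(image, mask, predict):
    # Drop in target-class probability when the masked region is occluded.
    occluded = image * (1 - mask)
    return predict(image) - predict(occluded)

def localize_sketch(image, predict, cell=2):
    h, w = image.shape
    mask = np.zeros((h, w))
    # "Forward action" analogue: explore grid cells and keep the one whose
    # occlusion most reduces the target-class probability.
    best, best_drop = None, 0.0
    for i in range(0, h, cell):
        for j in range(0, w, cell):
            trial = np.zeros((h, w))
            trial[i:i + cell, j:j + cell] = 1
            drop = sensitivity(image, trial, predict)
            if drop > best_drop:
                best_drop, best = drop, (i, j)
    if best is not None:
        i, j = best
        mask[i:i + cell, j:j + cell] = 1
    # "Reverse action" analogue: fine-tune the mask by removing pixels
    # whose exclusion does not weaken the sensitivity of the mask.
    for i, j in np.argwhere(mask == 1):
        trial = mask.copy()
        trial[i, j] = 0
        if sensitivity(image, trial, predict) >= best_drop - 1e-9:
            mask = trial
    return mask

image = np.zeros((8, 8))
image[:2, :2] = 1.0  # informative region the stand-in model relies on
mask = localize_sketch(image, black_box_predict)
```

In this toy setting, the sketch recovers a mask over the informative top-left pixels using only queries to `black_box_predict`, mirroring the paper's premise that localization can be driven by the output distribution alone.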