LFI-CAM: Learning Feature Importance for Better Visual Explanation

Kwang Hee Lee, Chaewon Park, Junghyun Oh, Nojun Kwak; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 1355-1363

Abstract


Class Activation Mapping (CAM) is a powerful technique for understanding the decision making of Convolutional Neural Networks (CNNs) in computer vision. Recently, there have been attempts not only to generate better visual explanations, but also to improve classification performance using visual explanations. However, previous works still have their own drawbacks. In this paper, we propose a novel architecture, LFI-CAM (Learning Feature Importance Class Activation Mapping), which is trainable for image classification and visual explanation in an end-to-end manner. LFI-CAM generates an attention map for visual explanation during forward propagation, and simultaneously uses that attention map to improve classification performance through the attention mechanism. Its Feature Importance Network (FIN) learns the feature importance instead of directly learning the attention map, which yields a more reliable and consistent attention map. We confirm that LFI-CAM is optimized not only by learning the feature importance but also by enhancing the backbone feature representation to focus more on important features of the input image. Experiments show that LFI-CAM outperforms the baseline models in classification accuracy and significantly improves on previous works in attention map quality and stability across different hyper-parameters.
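The data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names are hypothetical, the importance vector is a random stand-in for what the FIN would learn, and the residual form `features * (1 + attention)` is an assumed way of applying the attention map, since the abstract does not give the exact formula.

```python
import numpy as np


def attention_from_importance(features, importance):
    """Build a CAM-style attention map from per-channel importance.

    features:   array of shape (C, H, W), backbone feature maps
    importance: array of shape (C,), one weight per channel
    Returns an (H, W) map scaled to [0, 1].
    """
    # Weighted sum over the channel axis, as in CAM.
    cam = np.tensordot(importance, features, axes=([0], [0]))
    # Keep only positive evidence.
    cam = np.maximum(cam, 0.0)
    span = cam.max() - cam.min()
    if span > 0:
        return (cam - cam.min()) / span
    return np.zeros_like(cam)


def apply_attention(features, attention):
    """Re-weight features with the attention map (assumed residual form)."""
    return features * (1.0 + attention[None, :, :])


rng = np.random.default_rng(0)
features = rng.standard_normal((8, 4, 4))  # hypothetical backbone output

# Stand-in for the FIN output: a softmax over random logits.
logits = rng.standard_normal(8)
importance = np.exp(logits - logits.max())
importance /= importance.sum()

attention = attention_from_importance(features, importance)
attended = apply_attention(features, attention)
print(attention.shape, attended.shape)
```

In this sketch the attention map is a by-product of the forward pass, and the re-weighted features would then feed the classifier, so a single pass yields both the explanation and the prediction.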

Related Material


[bibtex]
@InProceedings{Lee_2021_ICCV,
    author    = {Lee, Kwang Hee and Park, Chaewon and Oh, Junghyun and Kwak, Nojun},
    title     = {LFI-CAM: Learning Feature Importance for Better Visual Explanation},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {1355-1363}
}