Visual Explanation Generation Based on Lambda Attention Branch Networks

Tsumugi Iida, Takumi Komatsu, Kanta Kaneda, Tsubasa Hirakawa, Takayoshi Yamashita, Hironobu Fujiyoshi, Komei Sugiura; Proceedings of the Asian Conference on Computer Vision (ACCV), 2022, pp. 3536-3551

Abstract


Explanation generation for transformers enhances accountability for their predictions. However, few studies have addressed generating visual explanations for transformers that use multidimensional context, such as LambdaNetworks. In this paper, we propose Lambda Attention Branch Networks, which attend to important regions in detail and generate easily interpretable visual explanations. We also propose the Patch Insertion-Deletion score, an extension of the Insertion-Deletion score, as an effective evaluation metric for images with sparse important regions. Experimental results on two public datasets indicate that the proposed method successfully generates visual explanations.

Related Material


[bibtex]
@InProceedings{Iida_2022_ACCV,
    author    = {Iida, Tsumugi and Komatsu, Takumi and Kaneda, Kanta and Hirakawa, Tsubasa and Yamashita, Takayoshi and Fujiyoshi, Hironobu and Sugiura, Komei},
    title     = {Visual Explanation Generation Based on Lambda Attention Branch Networks},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {December},
    year      = {2022},
    pages     = {3536-3551}
}