Sparse Activation Maps for Interpreting 3D Object Detection
We propose a technique to generate "visual explanations" that aid the interpretability of volumetric-based 3D object detection networks. Specifically, we use average pooling of weights to produce a Sparse Activation Map (SAM) that highlights the important regions of the 3D point cloud data. SAMs are model agnostic: they can be applied to any volumetric-based model to provide intuitive intermediate results at different layers and to understand the complexity of network structures. SAMs at the 3D and 2D feature-map layers reveal how effectively neurons capture object information, while SAMs at the classification layer, computed per object class, help to explain the true positives and false positives of each network. Experimental results on the KITTI dataset demonstrate that the visual observations from the SAMs match the detection results of three volumetric-based models.
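As a rough illustration of the idea, the following sketch collapses a volumetric feature map into a single saliency volume by channel-wise average pooling and min-max normalization. This is a hypothetical simplification, not the paper's exact SAM computation; the function name, tensor layout `(C, D, H, W)`, and pooling choice are assumptions for the example.

```python
import numpy as np

def sparse_activation_map(feature_map: np.ndarray) -> np.ndarray:
    """Collapse a (C, D, H, W) volumetric feature map into a (D, H, W)
    saliency volume via channel-wise average pooling, then min-max
    normalize to [0, 1]. Sketch of a CAM-style activation map."""
    sam = feature_map.mean(axis=0)   # average pool over the channel axis
    sam -= sam.min()                 # shift so the minimum is 0
    rng = sam.max()
    if rng > 0:                      # guard against a constant map
        sam /= rng                   # scale so the maximum is 1
    return sam

# Toy volumetric feature map: 8 channels over a 4x16x16 voxel grid.
feats = np.random.rand(8, 4, 16, 16).astype(np.float32)
sam = sparse_activation_map(feats)
```

High values in `sam` would mark voxels that contribute strongly across channels; overlaying such a map on the input point cloud is one way to visualize which regions drive a detection.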