Capsule Network Is Not More Robust Than Convolutional Network

Gu, Jindong; Tresp, Volker; Hu, Han

Jindong Gu, Volker Tresp, Han Hu; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 14309-14317

Abstract

The Capsule Network (CapsNet) is widely believed to be more robust than Convolutional Networks (ConvNets). However, comprehensive comparisons between these two networks are lacking, and it is also unknown which components of the CapsNet affect its robustness. In this paper, we first carefully examine the special designs in the CapsNet that differ from those of a ConvNet commonly used for image classification. The examination reveals five major new or different components in the CapsNet: a transformation process, a dynamic routing layer, a squashing function, a marginal loss in place of the cross-entropy loss, and an additional class-conditional reconstruction loss for regularization. Taking these major differences in turn, we comprehensively ablate their effect on three kinds of robustness: affine transformation, overlapping digits, and semantic representation. The study reveals that some designs thought to be critical to the CapsNet, namely the dynamic routing layer and the transformation process, can actually harm its robustness, while the others are beneficial. Based on these findings, we propose enhanced ConvNets simply by introducing the essential components behind the CapsNet's success. The proposed simple ConvNets can achieve better robustness than the CapsNet.
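Two of the components named above, the squashing function and the marginal (margin) loss, are compact enough to sketch. The following is a minimal pure-Python illustration based on the commonly cited CapsNet formulation of Sabour et al. (2017), not code from this paper. The function names, the `eps` stabilizer, and the constants `m_pos=0.9`, `m_neg=0.1`, `lam=0.5` are assumptions taken from that earlier formulation; the abstract itself does not state them.

```python
import math

def squash(s, eps=1e-9):
    """Shrink vector s to a length in [0, 1) while keeping its direction.

    Assumed form: v = (|s|^2 / (1 + |s|^2)) * (s / |s|).
    """
    sq_norm = sum(x * x for x in s)
    norm = math.sqrt(sq_norm)
    # eps (an assumption) avoids division by zero for the all-zero vector.
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * x for x in s]

def margin_loss(lengths, target, m_pos=0.9, m_neg=0.1, lam=0.5):
    """Margin loss over output-capsule lengths for one example.

    lengths: length of each class capsule, each in [0, 1).
    target:  index of the true class.
    """
    loss = 0.0
    for k, length in enumerate(lengths):
        if k == target:
            # Penalize a true-class capsule that is shorter than m_pos.
            loss += max(0.0, m_pos - length) ** 2
        else:
            # Penalize, with weight lam, a wrong-class capsule longer than m_neg.
            loss += lam * max(0.0, length - m_neg) ** 2
    return loss

if __name__ == "__main__":
    v = squash([3.0, 4.0])
    print(math.sqrt(sum(x * x for x in v)))  # length of the squashed vector
    print(margin_loss([0.95, 0.05], target=0))
```

Because the capsule length plays the role of a class probability here, the loss is computed per class on vector lengths rather than on softmax logits, which is the sense in which it replaces the cross-entropy loss.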

Related Material

[bibtex]

@InProceedings{Gu_2021_CVPR,
    author    = {Gu, Jindong and Tresp, Volker and Hu, Han},
    title     = {Capsule Network Is Not More Robust Than Convolutional Network},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {14309-14317}
}