Refining Activation Downsampling With SoftPool

Alexandros Stergiou, Ronald Poppe, Grigorios Kalliatakis; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 10357-10366

Abstract


Convolutional Neural Networks (CNNs) use pooling to decrease the size of activation maps. This process is crucial to increase the receptive fields and to reduce the computational requirements of subsequent convolutions. An important property of a pooling operation is that it minimizes information loss with respect to the initial activation maps without significantly increasing computation or memory overhead. To meet these requirements, we propose SoftPool: a fast and efficient method for exponentially weighted activation downsampling. Through experiments across a range of architectures and pooling methods, we demonstrate that SoftPool can retain more information in the reduced activation maps. This refined downsampling leads to improvements in a CNN's classification accuracy. Experiments with pooling layer substitutions on ImageNet1K show an increase in accuracy over both original architectures and other pooling methods. We also test SoftPool on video datasets for action recognition. Again, through the direct replacement of pooling layers, we observe consistent performance improvements while computational loads and memory requirements remain limited.
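To illustrate the idea of exponentially weighted downsampling described above, the following is a minimal PyTorch-style sketch: each activation in a pooling window is weighted by its softmax weight e^a / sum(e^a) over that window, so larger activations dominate the output without smaller ones being discarded entirely. The function name soft_pool2d, the default kernel size, and the absence of numerical stabilization (e.g. subtracting a per-window maximum before exponentiation) are assumptions for illustration, not the authors' reference implementation.

    import torch
    import torch.nn.functional as F

    def soft_pool2d(x, kernel_size=2, stride=None):
        # Exponentially (softmax) weighted 2D pooling: for each window,
        # output = sum(exp(a) * a) / sum(exp(a)).
        # NOTE: illustrative sketch; no overflow guard on torch.exp.
        stride = stride or kernel_size
        exp_x = torch.exp(x)
        # The kernel-area factors of the two average pools cancel,
        # so the ratio of the two pooled tensors gives the weighted sum.
        num = F.avg_pool2d(exp_x * x, kernel_size, stride)
        den = F.avg_pool2d(exp_x, kernel_size, stride)
        return num / den

    # Example usage: replaces a 2x2 max/average pooling layer.
    x = torch.randn(1, 3, 224, 224)
    y = soft_pool2d(x, kernel_size=2)   # shape: (1, 3, 112, 112)

Because the operation reduces to two average-pooling passes and an elementwise division, it can be dropped in as a direct replacement for existing pooling layers with limited additional computation, which is how the pooling-layer substitution experiments in the abstract are set up.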

Related Material


@InProceedings{Stergiou_2021_ICCV,
    author    = {Stergiou, Alexandros and Poppe, Ronald and Kalliatakis, Grigorios},
    title     = {Refining Activation Downsampling With SoftPool},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {10357-10366}
}