Channel-Max, Channel-Drop and Stochastic Max-Pooling

Yuchi Huang, Xiuyu Sun, Ming Lu, Ming Xu; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2015, pp. 9-17


We propose three regularization techniques to overcome drawbacks of local winner-take-all methods used in deep convolutional networks. Channel-Max inherits the max activation unit from Maxout networks, but otherwise adopts complementary subsets of input and filters with different kernel sizes as better companions to the max function. To balance the training on different pathways, Channel-Drop is employed to randomly discard half pathways before their inputs are convolved respectively. Stochastic Max-pooling is defined to reduce the overfitting caused by conventional max-pooling, in which half activations are randomly dropped in each pooling region during training and top largest activations are probabilistically averaged during testing. Using Channel-Max, Channel-Drop and Stochastic Max-pooling, we demonstrate state-of-the-art performance on four benchmark datasets: CIFAR-10, CIFAR-100, STL-10 and SVHN.

Related Material

author = {Huang, Yuchi and Sun, Xiuyu and Lu, Ming and Xu, Ming},
title = {Channel-Max, Channel-Drop and Stochastic Max-Pooling},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2015}