“Learning-Compression” Algorithms for Neural Net Pruning

Miguel Á. Carreira-Perpiñán, Yerlan Idelbayev; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 8532-8541


Pruning a neural net consists of removing weights without degrading its performance. This is an old problem of renewed interest because of the need to compress ever larger nets so they can run in mobile devices. Pruning has been traditionally done by ranking or penalizing weights according to some criterion (such as magnitude), removing low-ranked weights and retraining the remaining ones. We formulate pruning as an optimization problem of finding the weights that minimize the loss while satisfying a pruning cost condition. We give a generic algorithm to solve this which alternates "learning" steps that optimize a regularized, data-dependent loss and "compression" steps that mark weights for pruning in a data-independent way. Magnitude thresholding arises naturally in the compression step, but unlike existing magnitude pruning approaches, our algorithm explores subsets of weights rather than committing irrevocably to a specific subset from the beginning. It is also able to learn automatically the best number of weights to prune in each layer of the net without incurring an exponentially costly model selection. Using a single pruning-level user parameter, we achieve state-of-the-art pruning in nets of various sizes.

Related Material

[pdf] [supp] [video]
author = {Carreira-Perpiñán, Miguel Á. and Idelbayev, Yerlan},
title = {“Learning-Compression” Algorithms for Neural Net Pruning},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2018}