Discrete Model Compression With Resource Constraint for Deep Neural Networks

Shangqian Gao, Feihu Huang, Jian Pei, Heng Huang; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 1899-1908

Abstract


In this paper, we address the problem of compressing and accelerating Convolutional Neural Networks (CNNs). Specifically, we propose a novel structural pruning method to obtain a compact CNN with strong discriminative power. To find such networks, we propose an efficient discrete optimization method that directly optimizes channel-wise differentiable discrete gates under a resource constraint while freezing all other model parameters. Although directly optimizing discrete variables is a non-smooth, non-convex, NP-hard problem, our optimization method circumvents these difficulties by using the straight-through estimator. Our method thus ensures that the sub-network discovered during training reflects the true sub-network. We further extend the discrete gate to a stochastic version in order to thoroughly explore the space of potential sub-networks. Unlike many previous methods that require per-layer hyper-parameters, ours needs only a single hyper-parameter to control the FLOPs budget. Moreover, our method is globally discrimination-aware due to the discrete setting. Experimental results on CIFAR-10 and ImageNet show that our method is competitive with state-of-the-art methods.
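The core mechanism the abstract describes, discrete channel gates trained through a straight-through estimator (STE) under a FLOPs budget, can be illustrated with a minimal toy sketch. Everything here (the quadratic budget penalty, uniform per-channel FLOPs costs, the 0.5 threshold, and the learning rate) is an illustrative assumption, not the paper's actual objective or code:

```python
import numpy as np

# Toy sketch: channel-wise discrete gates trained with a straight-through
# estimator (STE) to satisfy a FLOPs budget. All constants below are
# illustrative assumptions, not taken from the paper.

def hard_gate(w):
    """Forward pass: binarize continuous gate parameters at a 0.5 threshold."""
    return (w >= 0.5).astype(w.dtype)

rng = np.random.default_rng(0)
w = rng.uniform(0.0, 1.0, size=8)   # one continuous gate parameter per channel
channel_flops = np.full(8, 1.0)     # toy FLOPs cost contributed by each channel
budget = 4.0                        # target FLOPs budget
lr = 0.05

for _ in range(200):
    g = hard_gate(w)                                  # discrete gates used in the forward pass
    over = max(np.dot(g, channel_flops) - budget, 0)  # budget violation
    # Gradient of the penalty over**2 with respect to each gate output g_i:
    grad_g = 2.0 * over * channel_flops
    # STE backward pass: the step function's zero/undefined gradient is
    # replaced by the identity, so dL/dw is taken to equal dL/dg.
    w -= lr * grad_g

final_gates = hard_gate(w)
final_flops = float(np.dot(final_gates, channel_flops))
# final_flops <= budget once training settles; the surviving gates are
# exactly 0 or 1, so the selected sub-network is truly discrete.
```

Because the forward pass always uses the hard binary gates, the sub-network evaluated during training is the same one that would be deployed, which is the "true sub-network" property the abstract refers to; in the paper the penalty would accompany the task loss rather than stand alone.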

Related Material


[bibtex]
@InProceedings{Gao_2020_CVPR,
author = {Gao, Shangqian and Huang, Feihu and Pei, Jian and Huang, Heng},
title = {Discrete Model Compression With Resource Constraint for Deep Neural Networks},
booktitle = {The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}