SqueezeNext: Hardware-Aware Neural Network Design

Amir Gholami, Kiseok Kwon, Bichen Wu, Zizheng Tai, Xiangyu Yue, Peter Jin, Sicheng Zhao, Kurt Keutzer; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2018, pp. 1638-1647


One of the main barriers to deploying neural networks on embedded systems has been the large memory and power consumption of existing neural network architectures. In this work, we introduce SqueezeNext, a new family of neural network architectures whose design was guided by consideration of previous architectures such as SqueezeNet, as well as by simulation results on a neural network accelerator. This new neural network architecture is able to match AlexNet's accuracy on the ImageNet benchmark with 112x fewer parameters, and one of its deeper variants is able to achieve VGG-19 accuracy with only 4.4 million parameters (31x smaller than VGG-19). SqueezeNext also achieves better top-5 classification accuracy with 1.3x fewer parameters as compared to MobileNet, but avoids using depthwise-separable convolutions, which are inefficient on some mobile processor platforms. This wide range of accuracy gives the user the ability to make speed-accuracy tradeoffs, depending on the available resources on the target hardware. Hardware simulation results for power and inference speed on an embedded system guided us to design variations of the baseline model that achieved up to 20% better hardware utilization with minimal difference in accuracy.
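The abstract contrasts SqueezeNext's parameter budget with MobileNet's depthwise-separable convolutions. As background, the standard parameter-count formulas (textbook formulas, not taken from the paper) show why depthwise-separable layers are attractive for model size even though they can map poorly to some mobile processors. A minimal sketch:

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Parameters in a standard k x k convolution (ignoring bias)."""
    return k * k * c_in * c_out

def dw_separable_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Parameters in a depthwise-separable convolution:
    a k x k depthwise filter per input channel, followed by
    a 1x1 pointwise convolution mixing channels."""
    depthwise = k * k * c_in       # one k x k filter per input channel
    pointwise = c_in * c_out       # 1x1 conv across channels
    return depthwise + pointwise

# Example layer: 3x3 conv, 64 input channels, 128 output channels.
standard = standard_conv_params(3, 64, 128)       # 73,728 parameters
separable = dw_separable_conv_params(3, 64, 128)  # 8,768 parameters
print(standard, separable, round(standard / separable, 1))
```

For this example layer the separable form uses roughly 8x fewer parameters, which is the saving SqueezeNext instead pursues through other means (e.g. aggressive channel reduction) while keeping to convolutions that run efficiently on the target accelerator.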

Related Material

@InProceedings{Gholami_2018_CVPR_Workshops,
  author    = {Gholami, Amir and Kwon, Kiseok and Wu, Bichen and Tai, Zizheng and Yue, Xiangyu and Jin, Peter and Zhao, Sicheng and Keutzer, Kurt},
  title     = {SqueezeNext: Hardware-Aware Neural Network Design},
  booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
  month     = {June},
  year      = {2018}
}