Improving Low-Precision Network Quantization via Bin Regularization

Tiantian Han, Dong Li, Ji Liu, Lu Tian, Yi Shan; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 5261-5270

Abstract


Model quantization is an important mechanism for energy-efficient deployment of deep neural networks on resource-constrained devices, reducing the bit precision of weights and activations. However, it remains challenging to maintain high accuracy as bit precision decreases, especially for low-precision networks (e.g., 2-bit MobileNetV2). Existing methods attempt to address this problem by minimizing the quantization error or mimicking the data distribution of full-precision networks. In this work, we propose a novel weight regularization algorithm for improving low-precision network quantization. Instead of constraining the overall data distribution, we separately optimize all elements in each quantization bin to be as close to the target quantized value as possible. This bin regularization (BR) mechanism encourages the weight distribution within each quantization bin to be sharp and, ideally, to approximate a Dirac delta distribution. Experiments demonstrate that our method achieves consistent improvements over state-of-the-art quantization-aware training methods for different low-precision networks. In particular, our bin regularization improves LSQ for 2-bit MobileNetV2 and MobileNetV3-Small by 3.9% and 4.9% top-1 accuracy on ImageNet, respectively.
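To make the per-bin idea concrete, below is a minimal PyTorch sketch of a bin regularizer. The uniform quantizer, the fixed scale, and the specific per-bin penalty (squared distance of the bin mean to its target value plus the within-bin variance) are illustrative assumptions for this sketch, not the authors' exact formulation; they only convey how elements assigned to each bin can be pushed toward that bin's quantized value.

```python
# Sketch of bin regularization: group full-precision weights by their quantization
# bin and penalize each bin's spread around its target quantized value, so the
# per-bin distribution becomes sharp (Dirac-like). Quantizer and loss form are
# assumptions made for illustration.
import torch

def uniform_quantize(w: torch.Tensor, num_bits: int, scale: float):
    """Assign each weight to the nearest of 2^num_bits uniformly spaced levels."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    idx = torch.clamp(torch.round(w / scale), qmin, qmax)  # integer bin index
    return idx, idx * scale                                # bin index and target value

def bin_regularization(w: torch.Tensor, num_bits: int = 2, scale: float = 0.05):
    """Encourage weights in each quantization bin to concentrate at the bin's target value."""
    idx, target = uniform_quantize(w, num_bits, scale)
    loss = w.new_zeros(())
    for b in idx.unique():
        mask = idx == b
        w_b = w[mask]                          # full-precision weights assigned to bin b
        q_b = target[mask][0]                  # the bin's target quantized value
        mean_term = (w_b.mean() - q_b) ** 2    # pull the bin mean onto the target
        var_term = w_b.var(unbiased=False) if w_b.numel() > 1 else w_b.new_zeros(())
        loss = loss + mean_term + var_term     # small variance => sharp per-bin distribution
    return loss

# Usage: add the regularizer (with some weighting factor) to the task loss
# during quantization-aware training.
w = torch.randn(256, 128, requires_grad=True)
reg = bin_regularization(w)
reg.backward()
```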

Related Material


[bibtex]
@InProceedings{Han_2021_ICCV,
  author    = {Han, Tiantian and Li, Dong and Liu, Ji and Tian, Lu and Shan, Yi},
  title     = {Improving Low-Precision Network Quantization via Bin Regularization},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  month     = {October},
  year      = {2021},
  pages     = {5261-5270}
}