@InProceedings{Gongyo_2024_ACCV,
    author    = {Gongyo, Shinya and Liang, Jinrong and Ambai, Mitsuru and Kawakami, Rei and Sato, Ikuro},
    title     = {Learning Non-Uniform Step Sizes for Neural Network Quantization},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {December},
    year      = {2024},
    pages     = {4385-4402}
}
Learning Non-Uniform Step Sizes for Neural Network Quantization
Abstract
Quantization of neural networks enables faster inference, reduced memory usage, and lower energy consumption, all of which are crucial for deploying AI algorithms on devices. However, quantization may degrade performance compared to full-precision models as precision decreases. Prior research has primarily focused on uniform quantization of network weights and activations, but capturing the long-tailed distributions of these quantities with uniform steps poses a challenge. To address this issue, this paper introduces a non-uniform learned step-size quantization (nuLSQ) approach, which optimizes an individual step size for each quantization level of the weights and activations. Evaluations on the CIFAR-10/100 and ImageNet datasets, using ResNet, MobileNetV2, Swin-T, and ConvNeXt at 2-, 3-, and 4-bit precision, demonstrate that nuLSQ outperforms other quantization methods. The code is available at https://github.com/DensoITLab/nuLSQ.
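To make the core idea concrete, the following is a minimal NumPy sketch (not the authors' implementation) of a non-uniform quantizer in the spirit described above: each quantization level has its own step size, so the level spacing can adapt to a long-tailed value distribution. All function and variable names here are illustrative assumptions; the actual nuLSQ method additionally learns these step sizes by gradient descent with a straight-through estimator, which is omitted here.

```python
import numpy as np

def nonuniform_quantize(x, step_sizes):
    """Quantize non-negative values onto non-uniform levels.

    Illustrative sketch: the quantization levels are cumulative sums of
    per-level step sizes, so each gap between adjacent levels can be set
    (or, in training, learned) independently. Each input value is mapped
    to its nearest level.
    """
    # Levels: 0, s1, s1+s2, s1+s2+s3, ...
    levels = np.concatenate([[0.0], np.cumsum(step_sizes)])
    # Nearest-level assignment for every element of x.
    idx = np.abs(x[..., None] - levels).argmin(axis=-1)
    return levels[idx]

# 2-bit example: three step sizes give four levels {0, 0.5, 1.5, 3.0};
# growing step sizes place more resolution near zero, where most
# activation mass lies, and coarser levels out in the tail.
x = np.array([0.1, 0.6, 2.0, 5.0])
q = nonuniform_quantize(x, np.array([0.5, 1.0, 1.5]))
# q is [0.0, 0.5, 1.5, 3.0]
```

A uniform quantizer is the special case where all entries of `step_sizes` are equal; letting each entry differ is what allows the levels to track a long-tailed distribution.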