Improved Techniques for Quantizing Deep Networks With Adaptive Bit-Widths

Sun, Ximeng; Panda, Rameswar; Chen, Chun-Fu Richard; Wang, Naigang; Pan, Bowen; Oliva, Aude; Feris, Rogerio; Saenko, Kate

Ximeng Sun, Rameswar Panda, Chun-Fu Richard Chen, Naigang Wang, Bowen Pan, Aude Oliva, Rogerio Feris, Kate Saenko; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, pp. 957-967

Abstract

Quantizing deep networks with adaptive bit-widths is a promising technique for efficient inference across many devices and resource constraints. In contrast to static methods that repeat the quantization process and train different models for different constraints, adaptive quantization enables us to flexibly adjust the bit-widths of a single deep network during inference for instant adaptation in different scenarios. While existing research shows encouraging results on common image classification benchmarks, this paper investigates how to train such adaptive networks more effectively. Specifically, we present two novel techniques for quantizing deep neural networks with adaptive bit-widths of weights and activations. First, we propose a collaborative strategy to choose a high-precision "teacher" for transferring knowledge to the low-precision "student" while jointly optimizing the model with all bit-widths. Second, to effectively transfer knowledge, we develop a dynamic block swapping method that randomly replaces blocks in the lower-precision student network with the corresponding blocks in the higher-precision teacher network. Extensive experiments on multiple image classification datasets, together with novel video classification experiments, demonstrate the efficacy of our approach over state-of-the-art methods.
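The block swapping idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the names `quantize`, `make_block`, and `swapped_forward`, the scalar "blocks", the uniform quantizer, and the fixed swap probability `swap_prob` are all assumptions made for illustration, and the collaborative teacher selection is not modeled. The sketch only shows a forward pass in which each low-precision student block may be replaced at random by the matching higher-precision teacher block.

```python
import random
from typing import Callable, List, Sequence, Tuple

Block = Callable[[float], float]


def quantize(x: float, bits: int) -> float:
    """Toy uniform quantizer: clamp x to [0, 1] and snap it to 2**bits levels."""
    levels = 2 ** bits - 1
    x = min(max(x, 0.0), 1.0)
    return round(x * levels) / levels


def make_block(weight: float, bits: int) -> Block:
    """Hypothetical block: scale by a quantized weight, then quantize the activation."""
    w_q = quantize(weight, bits)
    return lambda x: quantize(w_q * x, bits)


def swapped_forward(
    x: float,
    student: Sequence[Block],
    teacher: Sequence[Block],
    swap_prob: float,
    rng: random.Random,
) -> Tuple[float, List[bool]]:
    """Run the student, replacing each block with the teacher's with probability swap_prob.

    Returns the output and a mask marking which positions used the teacher block.
    """
    if len(student) != len(teacher):
        raise ValueError("student and teacher must have the same number of blocks")
    used_teacher: List[bool] = []
    for s_block, t_block in zip(student, teacher):
        use_teacher = rng.random() < swap_prob  # assumed Bernoulli swap rule
        x = (t_block if use_teacher else s_block)(x)
        used_teacher.append(use_teacher)
    return x, used_teacher


# Same underlying weights, viewed at two bit-widths (assumed: 2-bit student, 8-bit teacher).
weights = [0.9, 0.8, 0.7]
student_blocks = [make_block(w, bits=2) for w in weights]
teacher_blocks = [make_block(w, bits=8) for w in weights]

output, mask = swapped_forward(0.5, student_blocks, teacher_blocks, 0.5, random.Random(0))
print(output, mask)
```

In this sketch, setting `swap_prob` to 0 recovers the plain student and setting it to 1 recovers the plain teacher; intermediate values produce mixed-precision paths through the same weights. How the swap decision is actually made during training is specified in the paper, not here.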

Related Material


@InProceedings{Sun_2024_WACV,
    author    = {Sun, Ximeng and Panda, Rameswar and Chen, Chun-Fu Richard and Wang, Naigang and Pan, Bowen and Oliva, Aude and Feris, Rogerio and Saenko, Kate},
    title     = {Improved Techniques for Quantizing Deep Networks With Adaptive Bit-Widths},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2024},
    pages     = {957-967}
}