RAVN: Reinforcement Aided Adaptive Vector Quantization of Deep Neural Networks

Anamika Jha, Aratrik Chattopadhyay, Mrinal Banerji, Disha Jain; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 2200-2209

Abstract


In the expanding field of deep learning, deploying deep neural networks (DNNs) in resource-constrained environments presents daunting challenges due to their complexity. Existing methodologies try to reduce model complexity through quantization of the DNNs. Adaptive quantization (AQ) is one such quantization technique for reducing model complexity. The drawbacks of current adaptive quantization techniques include limited adaptability to different datasets and models, suboptimal codebook generation, high computational complexity, and limited generalization to unseen scenarios. In contrast, we propose to address these issues through a sophisticated AQ methodology which incorporates vector quantization (VQ) of weights and Quantization-Aware Training (QAT) in tandem with reinforcement learning (RL). The above-mentioned approach facilitates dynamic allocation of quantization parameters of the DNN models, thereby reducing complexity and power utilization and easing deployment on edge devices. We evaluated our proposed approach on three publicly available benchmark datasets, namely CIFAR-10, CIFAR-100, and ImageNet, on state-of-the-art floating-point DNN architectures, and showed a boost of up to 4% over their respective quantized counterparts.
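The core idea of vector quantization of weights — replacing a layer's floating-point weights with indices into a small shared codebook — can be illustrated with a toy, NumPy-only k-means sketch. This is purely illustrative and is not the paper's RAVN algorithm: RAVN additionally learns codebooks through QAT and allocates quantization parameters per layer via RL, neither of which is modeled here. The function name `vq_weights` and all parameters are our own choices for the example.

```python
import numpy as np

def vq_weights(weights, k=16, iters=20, seed=0):
    """Toy k-means vector quantization of a weight tensor.

    Learns a codebook of k scalar centroids, so the layer can be stored
    as small integer codes plus the codebook instead of full floats.
    Illustrative only; not the RAVN training procedure.
    """
    rng = np.random.default_rng(seed)
    flat = weights.reshape(-1, 1).astype(np.float64)
    # Initialise centroids by sampling k distinct existing weight values.
    codebook = flat[rng.choice(len(flat), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign each weight to its nearest centroid (broadcasted distances).
        idx = np.argmin(np.abs(flat - codebook.T), axis=1)
        # Move each centroid to the mean of its assigned weights.
        for c in range(k):
            members = flat[idx == c]
            if len(members):
                codebook[c] = members.mean()
    idx = np.argmin(np.abs(flat - codebook.T), axis=1)
    quantized = codebook[idx].reshape(weights.shape)
    return quantized, codebook.ravel(), idx.reshape(weights.shape)

# Example: quantize a random 64x64 "layer" down to 16 shared values.
w = np.random.default_rng(1).standard_normal((64, 64)).astype(np.float32)
wq, codebook, codes = vq_weights(w, k=16)
```

After quantization, `wq` contains at most `k` distinct values, so the layer can be stored as 4-bit codes plus a 16-entry codebook — the storage saving that motivates VQ-based compression.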

Related Material


[pdf]
[bibtex]
@InProceedings{Jha_2024_CVPR,
    author    = {Jha, Anamika and Chattopadhyay, Aratrik and Banerji, Mrinal and Jain, Disha},
    title     = {RAVN: Reinforcement Aided Adaptive Vector Quantization of Deep Neural Networks},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {2200-2209}
}