Partial Binarization of Neural Networks for Budget-Aware Efficient Learning

Udbhav Bamba, Neeraj Anand, Saksham Aggarwal, Dilip K. Prasad, Deepak K. Gupta; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, pp. 2336-2345

Abstract


Binarization is a powerful compression technique for neural networks, significantly reducing FLOPs, but often results in a significant drop in model performance. To address this issue, partial binarization techniques have been developed, but a systematic approach to mixing binary and full-precision parameters in a single network is still lacking. In this paper, we propose a controlled approach to partial binarization, creating a budgeted binary neural network (B2NN) with our MixBin strategy. This method optimizes the mixing of binary and full-precision components, allowing for explicit selection of the fraction of the network to remain binary. Our experiments show that B2NNs created using MixBin outperform those from random or iterative searches and state-of-the-art layer selection methods by up to 3% on the ImageNet-1K dataset. We also show that B2NNs outperform the structured pruning baseline by approximately 23% at the extreme FLOP budget of 15%, and perform well in object tracking, with up to a 12.4% relative improvement over other baselines. Additionally, we demonstrate that B2NNs developed by MixBin can be transferred across datasets, with some cases showing improved performance over directly applying MixBin on the downstream data.

Related Material


[pdf] [supp] [arXiv]
[bibtex]
@InProceedings{Bamba_2024_WACV, author = {Bamba, Udbhav and Anand, Neeraj and Aggarwal, Saksham and Prasad, Dilip K. and Gupta, Deepak K.}, title = {Partial Binarization of Neural Networks for Budget-Aware Efficient Learning}, booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)}, month = {January}, year = {2024}, pages = {2336-2345} }