Prune Efficiently by Soft Pruning

Parakh Agarwal, Manu Mathew, Kunal Ranjan Patel, Varun Tripathi, Pramod Swami; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 2210-2217

Abstract


Embedded systems are power sensitive and have limited memory, so inferencing large networks on such systems is difficult. Pruning techniques have been instrumental in enhancing the efficiency of state-of-the-art convolutional neural networks on embedded systems. Traditional algorithms tend to eliminate weights abruptly during training, which may not provide the best accuracy. The proposed approach, called the Soft Pruning using Weight Blending (SPWB) algorithm, is designed to retain critical information by incrementally reducing the network's weights to zero. Additionally, our method of channel pruning is cognizant of connections, allowing for optimal pruning that renders the network compatible with various inference engines. The findings demonstrate that the SPWB algorithm can halve the computational complexity (measured in FLOPs) of ResNet50 with only a minimal impact on top-1 accuracy (a 0.65% decrease) on the ImageNet dataset. We also present our pruning results for both unstructured weight sparsity and channel sparsity. Our method is easy to use and improves the network's performance and efficiency without compromising accuracy. It is available as a Python package and can be easily integrated into other training scripts.
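
The abstract describes soft pruning as incrementally driving weights to zero by blending, rather than masking them abruptly. The snippet below is a minimal illustrative sketch of that general idea, not the authors' package or API: the function name, the magnitude-based pruning criterion, and the linear blend schedule are all assumptions made for illustration.

```python
# Hypothetical sketch of "soft" pruning via weight blending (not the authors' code):
# each weight tensor is blended with its hard-pruned counterpart, so pruned weights
# decay to zero gradually over training instead of being removed abruptly.
import torch

def soft_prune_step(weight: torch.Tensor, sparsity: float, blend: float) -> torch.Tensor:
    """Blend `weight` with a magnitude-pruned copy of itself.

    sparsity : target fraction of weights to remove (e.g. 0.5)
    blend    : ramps from 0.0 (no pruning) to 1.0 (hard pruning) during training
    """
    k = int(sparsity * weight.numel())
    if k == 0:
        return weight
    # Magnitude criterion (an assumption): weights below this threshold are prunable.
    threshold = weight.abs().flatten().kthvalue(k).values
    mask = (weight.abs() > threshold).to(weight.dtype)
    hard_pruned = weight * mask
    # Soft pruning: convex combination of the dense and hard-pruned weights.
    return (1.0 - blend) * weight + blend * hard_pruned

# Example: ramp the blend factor over epochs so small-magnitude weights are
# pushed to zero incrementally rather than all at once.
w = torch.randn(64, 64)
for epoch in range(10):
    blend = min(1.0, epoch / 8)  # reaches full (hard) pruning near the end
    with torch.no_grad():
        w.copy_(soft_prune_step(w, sparsity=0.5, blend=blend))
```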

Related Material


[pdf]
[bibtex]
@InProceedings{Agarwal_2024_CVPR,
  author    = {Agarwal, Parakh and Mathew, Manu and Patel, Kunal Ranjan and Tripathi, Varun and Swami, Pramod},
  title     = {Prune Efficiently by Soft Pruning},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
  month     = {June},
  year      = {2024},
  pages     = {2210-2217}
}