Single-Shot Pruning for Pre-Trained Models: Rethinking the Importance of Magnitude Pruning

Hirokazu Kohama, Hiroaki Minoura, Tsubasa Hirakawa, Takayoshi Yamashita, Hironobu Fujiyoshi; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2023, pp. 1433-1442

Abstract


Transformer models with large-scale pre-training have achieved excellent performance on various computer vision tasks. However, such models are huge and difficult to deploy on mobile devices with limited computational resources. Moreover, the computational cost of fine-tuning is high when the model is optimized for a downstream task. Our goal is therefore to compress large pre-trained models with minimal performance degradation before fine-tuning. In this paper, we first present preliminary experimental results on how parameters change when training on a downstream task from a pre-trained model versus from scratch. We found that the parameter magnitudes of pre-trained models remain largely unchanged before and after training, in contrast to scratch models. With this in mind, we propose an unstructured pruning method for pre-trained models. Our method evaluates the parameters without training and prunes them in a single shot to obtain sparse models. Our experimental results show that sparse models pruned by our method achieve higher accuracy than previous methods on the CIFAR-10, CIFAR-100, and ImageNet classification tasks.
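The abstract describes single-shot unstructured magnitude pruning: each parameter is scored by its magnitude once, without any training, and the smallest fraction is zeroed out. A minimal sketch of this generic idea is below, using NumPy; the function name, the per-tensor sparsity schedule, and the tie-breaking rule are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def single_shot_magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of each weight tensor.

    A generic sketch of single-shot (one pass, no retraining) unstructured
    magnitude pruning; the paper's actual importance score may differ.

    weights:  dict mapping tensor name -> np.ndarray
    sparsity: fraction of entries to prune per tensor, in [0, 1)
    """
    pruned = {}
    for name, w in weights.items():
        k = int(sparsity * w.size)  # number of entries to remove
        if k == 0:
            pruned[name] = w.copy()
            continue
        # Threshold = k-th smallest magnitude; keep strictly larger entries.
        flat = np.abs(w).ravel()
        threshold = np.partition(flat, k - 1)[k - 1]
        mask = np.abs(w) > threshold
        pruned[name] = w * mask
    return pruned

# Toy example: prune half of a 2x2 tensor (the two smallest magnitudes).
toy = {"layer": np.array([[0.1, -0.5], [0.9, 0.05]])}
sparse = single_shot_magnitude_prune(toy, 0.5)
```

In this toy run the entries 0.1 and 0.05 are zeroed while -0.5 and 0.9 survive, so the tensor reaches 50% sparsity in a single pass.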

Related Material


[pdf]
[bibtex]
@InProceedings{Kohama_2023_ICCV,
    author    = {Kohama, Hirokazu and Minoura, Hiroaki and Hirakawa, Tsubasa and Yamashita, Takayoshi and Fujiyoshi, Hironobu},
    title     = {Single-Shot Pruning for Pre-Trained Models: Rethinking the Importance of Magnitude Pruning},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
    month     = {October},
    year      = {2023},
    pages     = {1433-1442}
}