JointSQ: Joint Sparsification-Quantization for Distributed Learning

Weiying Xie, Haowei Li, Jitao Ma, Yunsong Li, Jie Lei, Donglai Liu, Leyuan Fang; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 5778-5787

Abstract


Gradient sparsification and quantization offer a promising way to alleviate the communication overhead problem in distributed learning. However, a direct combination of the two yields suboptimal solutions because sparsification and quantization are not learned jointly. In this paper, we propose Joint Sparsification-Quantization (JointSQ), inspired by the observation that sparsification can be treated as 0-bit quantization regardless of architecture. Specifically, we mathematically formulate JointSQ as a mixed-precision quantization problem, which expands the solution space, and solve it with the designed MCKP-Greedy algorithm. Theoretical analysis demonstrates the minimal compression noise of JointSQ, and extensive experiments on various network architectures, including CNN, RNN, and Transformer, also validate this point. While introducing computation overhead comparable to, or even lower than, that of previous methods, JointSQ achieves a compression ratio of 1000x on different models while maintaining near-lossless accuracy, and brings a 1.4x to 2.9x speedup over existing methods.
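To make the core idea concrete, the sketch below (our own illustration, not the authors' released code) treats the bit-width of each gradient chunk as a choice from {0, 2, 4, 8}, where 0 bits means the chunk is dropped entirely (i.e., sparsification as 0-bit quantization), and greedily spends a total bit budget on the upgrades that reduce quantization error the most per extra bit, in the spirit of a multiple-choice-knapsack (MCKP) greedy. The function names, the chunk-level granularity, and the uniform quantizer are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical sketch of joint sparsification-quantization as mixed-precision
# bit assignment; not the paper's actual MCKP-Greedy implementation.
BITS = [0, 2, 4, 8]  # 0 bits == drop the chunk entirely (sparsification)

def quantize(chunk, bits):
    """Rough symmetric uniform quantization; 0 bits returns all zeros."""
    if bits == 0:
        return np.zeros_like(chunk)
    levels = 2 ** bits - 1
    scale = np.max(np.abs(chunk)) + 1e-12
    q = np.round((chunk / scale) * (levels / 2))
    return q * (2 * scale / levels)

def assign_bits(chunks, bit_budget):
    """Greedy bit assignment: start every chunk at 0 bits, then repeatedly
    apply the single upgrade with the best error reduction per extra bit
    until the total bit budget is exhausted."""
    n = len(chunks)
    choice = [0] * n  # index into BITS per chunk
    err = [float(np.sum((c - quantize(c, BITS[0])) ** 2)) for c in chunks]
    used = 0
    while True:
        best = None
        for i, c in enumerate(chunks):
            if choice[i] + 1 >= len(BITS):
                continue  # already at the highest bit-width
            nb = BITS[choice[i] + 1]
            extra = (nb - BITS[choice[i]]) * c.size  # added communication cost
            if used + extra > bit_budget:
                continue
            new_err = float(np.sum((c - quantize(c, nb)) ** 2))
            gain = (err[i] - new_err) / extra
            if best is None or gain > best[0]:
                best = (gain, i, nb, new_err, extra)
        if best is None:
            break  # no affordable upgrade remains
        _, i, nb, new_err, extra = best
        choice[i] += 1
        err[i] = new_err
        used += extra
    return [BITS[c] for c in choice]

# Example: 4 random gradient chunks under an average budget of 2 bits/element.
chunks = [np.random.randn(256) for _ in range(4)]
print(assign_bits(chunks, bit_budget=2 * 4 * 256))
```

Chunks whose upgrades never win the greedy comparison stay at 0 bits and are simply not transmitted, which is how sparsification and quantization are decided within one search rather than applied one after the other.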

Related Material


[bibtex]
@InProceedings{Xie_2024_CVPR,
    author    = {Xie, Weiying and Li, Haowei and Ma, Jitao and Li, Yunsong and Lei, Jie and Liu, Donglai and Fang, Leyuan},
    title     = {JointSQ: Joint Sparsification-Quantization for Distributed Learning},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {5778-5787}
}