Structured Weight Unification and Encoding for Neural Network Compression and Acceleration

Wei Jiang, Wei Wang, Shan Liu; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2020, pp. 714-715

Abstract


We investigate joint structured weight unification and weight encoding to compress deep neural network models for reduced storage and computation. We propose a structured weight unification method in which weight coefficients are unified according to a hardware-friendly structure, so that the unified weights can be efficiently encoded and the inference computation can be accelerated. Our method can be seen as a generalization of structured weight pruning: instead of removing the weights of a selected structure, we unify them to share a common value. A 3D pyramid-based encoding method is further proposed to complement the structurally learned weights, providing a systematic solution for compressing neural network models while preserving the network capacity and the original prediction performance. We also develop a training framework that iteratively optimizes the subproblems of weight unification and target prediction, achieving the desired unification rate with little loss in prediction performance. Experiments on several benchmark models and datasets across different tasks demonstrate the effectiveness of our approach.
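
To make the unification idea concrete, the following is a minimal Python/NumPy sketch of one possible unification rule, assuming each selected block of coefficients is replaced by a shared magnitude (here the block's mean absolute value) with the original signs kept. The block size and the unification rule are illustrative assumptions, not the paper's exact formulation; setting the shared value to zero instead would recover structured pruning as a special case.

import numpy as np

def unify_blocks(weights: np.ndarray, block_size: int = 4) -> np.ndarray:
    """Unify weights block by block so that each block of `block_size`
    coefficients shares one absolute value (original signs preserved).
    Assumes the total number of weights is divisible by block_size."""
    flat = weights.reshape(-1, block_size)              # group coefficients into fixed-size blocks
    shared = np.abs(flat).mean(axis=1, keepdims=True)   # one shared magnitude per block (illustrative rule)
    unified = np.sign(flat) * shared                    # keep signs, unify magnitudes within the block
    return unified.reshape(weights.shape)

# Usage example: unify a 2x8 weight matrix with blocks of 4 coefficients.
w = np.random.randn(2, 8).astype(np.float32)
w_unified = unify_blocks(w, block_size=4)

Because every block now carries a single magnitude, the unified weights are cheaper to encode and the per-block computation can be shared at inference time, which is the motivation stated in the abstract.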

Related Material


[bibtex]
@InProceedings{Jiang_2020_CVPR_Workshops,
author = {Jiang, Wei and Wang, Wei and Liu, Shan},
title = {Structured Weight Unification and Encoding for Neural Network Compression and Acceleration},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2020}
}