Data-Free Model Pruning at Initialization via Expanders
In light of the enormous computational resources required to store and train modern deep learning models, significant research has focused on model compression. When compressed networks must be deployed on remote devices before training, a compression scheme cannot use any training data or information derived from it (e.g., gradients). This leaves only the structure of the network to work with, and the literature on how graph structure affects network performance is scarce. Recently, expander graphs have been put forward as a tool for sparsifying neural architectures; unfortunately, existing expander-based models rarely outperform a naive random baseline. In this work, we propose a stronger model for generating expanders, which we then use to sparsify a variety of mainstream CNN architectures. We demonstrate that, in the sparse regime, accuracy is an increasing function of expansion, and we both analyse and elucidate the superior performance of our model over the alternatives.
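To make the central objects concrete: a sparse layer can be identified with a bipartite graph whose biadjacency matrix is the layer's connectivity mask, and its expansion can be gauged spectrally via the gap between the first and second singular values. The sketch below is purely illustrative and is not the generative model proposed in the paper; the function names and the choice of a left-regular random construction are assumptions for demonstration.

```python
import numpy as np

def random_regular_bipartite_mask(n_out, n_in, d, rng):
    """Connectivity mask where each output unit keeps d distinct input
    connections, chosen uniformly at random (a simple left-regular
    bipartite construction, used here only as an example)."""
    mask = np.zeros((n_out, n_in), dtype=bool)
    for i in range(n_out):
        cols = rng.choice(n_in, size=d, replace=False)
        mask[i, cols] = True
    return mask

def spectral_gap(mask):
    """1 - sigma_2/sigma_1 of the biadjacency matrix; a larger gap
    indicates better expansion of the underlying bipartite graph."""
    s = np.linalg.svd(mask.astype(float), compute_uv=False)
    return 1.0 - s[1] / s[0]

rng = np.random.default_rng(0)
mask = random_regular_bipartite_mask(64, 128, d=8, rng=rng)
gap = spectral_gap(mask)
```

Under this view, comparing candidate sparsity patterns at a fixed budget reduces to comparing their spectral gaps, which is the sense in which one pattern can be "more expanding" than another.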