Training Sparse Neural Networks

Suraj Srinivas, Akshayvarun Subramanya, R. Venkatesh Babu; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2017, pp. 138-145


The emergence of deep neural networks has brought human-level performance on large-scale computer vision tasks such as image classification. However, these deep networks typically contain a large number of parameters due to dense matrix multiplications and convolutions. As a result, these architectures are highly memory intensive, making them less suitable for embedded vision applications. Sparse computations are known to be much more memory efficient. In this work, we train and build neural networks that implicitly use sparse computations. We introduce additional gate variables to perform parameter selection and show that this is equivalent to using a spike-and-slab prior. We experimentally validate our method on both small and large networks, obtaining highly sparse neural network models.
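
The gating idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' training procedure: the abstract does not say how the gates are parameterized or learned, so the real-valued `gate_params`, the 0.5 threshold, and the function name `gated_weights` are assumptions made only for illustration. Each weight is multiplied by a binary gate, and weights whose gate is zero drop out of the computation.

```python
import numpy as np


def gated_weights(weights, gate_params, threshold=0.5):
    """Mask each weight with a binary gate (hypothetical helper).

    gate_params holds one real value per weight; thresholding it at
    `threshold` is an assumption, not a detail taken from the paper.
    """
    gates = (gate_params > threshold).astype(weights.dtype)
    return weights * gates, gates


# Toy layer: 4 inputs, 3 outputs, with random weights and gate parameters.
rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 3))
gate_params = rng.uniform(size=(4, 3))

effective, gates = gated_weights(weights, gate_params)

# Fraction of weights switched off by their gates.
sparsity = 1.0 - gates.mean()
print(f"sparsity: {sparsity:.2f}")
```

Only the weights whose gate equals one need to be stored, which is where the memory saving of a sparse model would come from.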

Related Material

author = {Srinivas, Suraj and Subramanya, Akshayvarun and Venkatesh Babu, R.},
title = {Training Sparse Neural Networks},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {July},
year = {2017}