Deep Fried Convnets

Zichao Yang, Marcin Moczulski, Misha Denil, Nando de Freitas, Alex Smola, Le Song, Ziyu Wang; Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2015, pp. 1476-1483

Abstract


The fully connected layers of a deep convolutional neural network typically contain over 90% of the network parameters, and consume the majority of the memory required to store the network. Reducing the number of parameters while preserving predictive performance is critically important for deploying deep neural networks in memory-constrained environments such as GPUs or embedded devices. In this paper we show how kernel methods, in particular a single Fastfood layer, can be used to replace the fully connected layers in a deep convolutional neural network. This deep fried network is end-to-end trainable in conjunction with convolutional layers. Our new architecture substantially reduces the memory footprint of convolutional networks trained on MNIST and ImageNet with no drop in predictive performance.
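
To make the idea of replacing a dense layer with a Fastfood layer more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the standard Fastfood factorization S H G Π H B, where S, G and B are diagonal matrices, Π is a fixed permutation and H is the Walsh-Hadamard matrix. The function names `fwht` and `fastfood_forward`, the 1/d scaling, and the random initialization are illustrative assumptions, and the nonlinearity that would follow in a full layer is omitted.

```python
import numpy as np


def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform of a length-2^k vector."""
    x = np.asarray(x, dtype=float).copy()
    n = x.shape[0]
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    h = 1
    while h < n:
        # Butterfly step: pair up neighbouring blocks of size h and
        # replace them with their sum and their difference.
        x = x.reshape(-1, 2, h)
        x = np.stack([x[:, 0] + x[:, 1], x[:, 0] - x[:, 1]], axis=1)
        x = x.reshape(n)
        h *= 2
    return x


def fastfood_forward(x, S, G, B, perm):
    """Apply S H G Pi H B to x without ever forming a d x d matrix.

    S, G, B are the diagonals (length-d vectors); perm is a fixed
    permutation of range(d). Dividing by d normalizes the two Hadamard
    transforms; this scaling is an assumption of the sketch.
    """
    d = x.shape[0]
    v = fwht(B * x)      # H B x
    v = v[perm]          # Pi (H B x)
    v = fwht(G * v)      # H G Pi H B x
    return S * v / d     # S H G Pi H B x, rescaled


# Hypothetical toy setup: d must be a power of two in this sketch.
d = 8
rng = np.random.default_rng(0)
x = rng.standard_normal(d)
S = rng.standard_normal(d)
G = rng.standard_normal(d)
B = rng.choice([-1.0, 1.0], size=d)
perm = rng.permutation(d)

y = fastfood_forward(x, S, G, B, perm)

# Stored numbers: three diagonals versus a full dense weight matrix.
fastfood_params = 3 * d
dense_params = d * d
print(fastfood_params, dense_params)
```

In this sketch the transform costs O(d log d) time and stores 3d numbers plus a fixed permutation, compared with the d² entries of a dense weight matrix of the same size, which is the kind of saving the abstract refers to.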

Related Material


[bibtex]
@InProceedings{Yang_2015_ICCV,
author = {Yang, Zichao and Moczulski, Marcin and Denil, Misha and de Freitas, Nando and Smola, Alex and Song, Le and Wang, Ziyu},
title = {Deep Fried Convnets},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
month = {December},
year = {2015}
}