Frequency Domain Compact 3D Convolutional Neural Networks

Chen, Hanting; Wang, Yunhe; Shu, Han; Tang, Yehui; Xu, Chunjing; Shi, Boxin; Xu, Chao; Tian, Qi; Xu, Chang

Hanting Chen, Yunhe Wang, Han Shu, Yehui Tang, Chunjing Xu, Boxin Shi, Chao Xu, Qi Tian, Chang Xu; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 1641-1650

Abstract

This paper studies the compression and acceleration of 3-dimensional convolutional neural networks (3D CNNs). To reduce the memory cost and computational complexity of deep neural networks, a number of algorithms have been explored by discovering redundant parameters in pre-trained networks. However, most of existing methods are designed for processing neural networks consisting of 2-dimensional convolution filters (i.e. image classification and detection) and cannot be straightforwardly applied for 3-dimensional filters (i.e. time series data). In this paper, we develop a novel approach for eliminating redundancy in the time dimensionality of 3D convolution filters by converting them into the frequency domain through a series of learned optimal transforms with extremely fewer parameters. Moreover, these transforms are forced to be orthogonal, and the calculation of feature maps can be accomplished in the frequency domain to achieve considerable speed-up rates. Experimental results on benchmark 3D CNN models and datasets demonstrate that the proposed Frequency Domain Compact 3D CNNs (FDC3D) can achieve the state-of-the-art performance, e.g. a 2x speed-up ratio on the 3D-ResNet-18 without obviously affecting its accuracy.

Related Material

[pdf]

[bibtex]

@InProceedings{Chen_2020_CVPR,
author = {Chen, Hanting and Wang, Yunhe and Shu, Han and Tang, Yehui and Xu, Chunjing and Shi, Boxin and Xu, Chao and Tian, Qi and Xu, Chang},
title = {Frequency Domain Compact 3D Convolutional Neural Networks},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}