Towards Memory-Efficient Neural Networks via Multi-Level In Situ Generation

Gu, Jiaqi; Zhu, Hanqing; Feng, Chenghao; Liu, Mingjie; Jiang, Zixuan; Chen, Ray T.; Pan, David Z.

Jiaqi Gu, Hanqing Zhu, Chenghao Feng, Mingjie Liu, Zixuan Jiang, Ray T. Chen, David Z. Pan; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 5229-5238

Abstract

Deep neural networks (DNNs) have shown superior performance in a variety of tasks. As they rapidly evolve, their escalating computation and memory demands make it challenging to deploy them on resource-constrained edge devices. Though many efficient accelerator designs, from traditional electronics to emerging photonics, have been successfully demonstrated, they are still bottlenecked by expensive memory accesses due to the tremendous gaps between the bandwidth, power, and latency of electrical memory and those of the computing cores. Previous solutions fail to fully leverage the ultra-fast computational speed of emerging DNN accelerators to break through the critical memory bound. In this work, we propose a general and unified framework that trades expensive memory transactions for ultra-fast on-chip computations, directly translating to performance improvement. We are the first to jointly explore the intrinsic correlations and bit-level redundancy within DNN kernels and propose a multi-level in situ generation mechanism with mixed-precision bases to achieve on-the-fly recovery of high-resolution parameters with minimum hardware overhead. Extensive experiments demonstrate that our proposed joint method can boost memory efficiency by 10-20x with comparable accuracy over four state-of-the-art designs when benchmarked on ResNet-18/DenseNet-121/MobileNetV2/V3 with various tasks.

Related Material

@InProceedings{Gu_2021_ICCV,
    author    = {Gu, Jiaqi and Zhu, Hanqing and Feng, Chenghao and Liu, Mingjie and Jiang, Zixuan and Chen, Ray T. and Pan, David Z.},
    title     = {Towards Memory-Efficient Neural Networks via Multi-Level In Situ Generation},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {5229-5238}
}