SMM-Conv: Scalar Matrix Multiplication With Zero Packing for Accelerated Convolution

Ofir, Amir; Ben-Artzi, Gil

Amir Ofir, Gil Ben-Artzi; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2022, pp. 3067-3075

Abstract

We present a novel approach for accelerating convolutions during inference on CPU-based architectures. The most common method of computation packs the image into the columns of a matrix (im2col) and performs general matrix multiplication (GEMM) with a matrix of weights. This has two main drawbacks: (a) im2col requires a large memory buffer and can suffer from inefficient memory access, and (b) while GEMM is highly optimized for scientific matrix multiplication, it is not well suited to convolutions. We propose an approach that takes advantage of scalar-matrix multiplication and reduces memory overhead. Our experiments with commonly used network architectures demonstrate a significant speedup compared to existing indirect methods.
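To make the contrast in the abstract concrete, the following NumPy sketch compares the im2col + GEMM formulation with a formulation built from scalar-matrix products. It is an illustration under stated assumptions, not the authors' implementation: the function names `conv2d_im2col` and `conv2d_scalar_matrix` are hypothetical, the convolution is a stride-1, unpadded cross-correlation, and the second function is only one plausible reading of "scalar-matrix multiplication", in which each kernel weight scales a shifted view of the input and the results are accumulated without any packed buffer.

```python
import numpy as np


def conv2d_im2col(x, w):
    """Stride-1, unpadded cross-correlation via im2col + GEMM.

    x: input of shape (C, H, W); w: weights of shape (K, C, R, S).
    """
    C, H, W = x.shape
    K, _, R, S = w.shape
    out_h, out_w = H - R + 1, W - S + 1
    # The packed buffer holds C*R*S rows of out_h*out_w values each,
    # i.e. roughly R*S copies of the input.
    cols = np.empty((C * R * S, out_h * out_w), dtype=x.dtype)
    row = 0
    for c in range(C):
        for i in range(R):
            for j in range(S):
                cols[row] = x[c, i:i + out_h, j:j + out_w].reshape(-1)
                row += 1
    # One GEMM: (K, C*R*S) @ (C*R*S, out_h*out_w).
    return (w.reshape(K, -1) @ cols).reshape(K, out_h, out_w)


def conv2d_scalar_matrix(x, w):
    """Same result, accumulated as scalar * shifted-input-window products.

    Assumption: this is an illustrative reading of the scalar-matrix idea,
    not the paper's kernel. No packed buffer is allocated; each term reads
    a view of x directly.
    """
    C, H, W = x.shape
    K, _, R, S = w.shape
    out_h, out_w = H - R + 1, W - S + 1
    out = np.zeros((K, out_h, out_w), dtype=x.dtype)
    for k in range(K):
        for c in range(C):
            for i in range(R):
                for j in range(S):
                    # Scalar weight times an (out_h, out_w) window of the input.
                    out[k] += w[k, c, i, j] * x[c, i:i + out_h, j:j + out_w]
    return out
```

Both functions compute the same output; the difference is that the first materializes a buffer about R*S times the size of the input before multiplying, while the second only ever touches the input and the output. Any speedup in practice depends on a low-level implementation, which this sketch does not attempt to reproduce.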

Related Material

@InProceedings{Ofir_2022_CVPR,
    author    = {Ofir, Amir and Ben-Artzi, Gil},
    title     = {SMM-Conv: Scalar Matrix Multiplication With Zero Packing for Accelerated Convolution},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2022},
    pages     = {3067-3075}
}