Practical Stacked Non-local Attention Modules for Image Compression

Haojie Liu, Tong Chen, Qiu Shen, Zhan Ma; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2019

Abstract


In this paper, we propose a stacked non-local attention-based variational autoencoder (VAE) for learned image compression. A non-local module captures the global correlations that traditional convolutional neural networks (CNNs) cannot model effectively, while layer-wise self-attention mechanisms activate and preserve important, hard-to-compress regions. We jointly use hyperpriors and autoregressive priors for conditional probability estimation. For practical application, we implement sparse non-local processing via max-pooling to greatly reduce memory consumption, and masked 3D convolutions to support parallel processing in autoregressive-prior-based probability prediction. A post-processing network is then concatenated to the decoder and trained jointly with it for quality enhancement. Evaluated on the public CLIC2019 validation and test datasets, our model achieves average multi-scale structural similarity (MS-SSIM) scores of 0.9753 and 0.9733, respectively, at bit rates below 0.15 bits per pixel (bpp).
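To illustrate the memory argument behind the sparse non-local processing described above, the following is a minimal NumPy sketch (not the authors' implementation): queries are taken from every spatial position, but keys/values are max-pooled first, so the attention matrix shrinks from (HW x HW) to (HW x HW/pool^2). The function name, pooling factor, and single-head formulation are illustrative assumptions.

```python
import numpy as np

def sparse_nonlocal(x, pool=2):
    """Hypothetical sparse non-local block on a C x H x W feature map.

    Keys/values are max-pooled by `pool` along each spatial axis,
    reducing attention memory by a factor of pool**2. Assumes H and W
    are divisible by `pool` for simplicity.
    """
    C, H, W = x.shape
    q = x.reshape(C, H * W)                         # queries: C x HW
    # Max-pool keys/values: (C, H, W) -> (C, H/pool, W/pool)
    k = x.reshape(C, H // pool, pool, W // pool, pool).max(axis=(2, 4))
    k = k.reshape(C, -1)                            # pooled keys: C x HW/pool^2
    attn = q.T @ k                                  # HW x HW/pool^2 (vs HW x HW)
    attn = np.exp(attn - attn.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)         # softmax over pooled positions
    y = (attn @ k.T).T                              # aggregate values: C x HW
    return y.reshape(C, H, W)

x = np.random.rand(8, 16, 16).astype(np.float32)
y = sparse_nonlocal(x, pool=2)
print(y.shape)  # (8, 16, 16)
```

With `pool=2` the attention matrix has a quarter of the entries of the dense non-local formulation, which is the source of the memory savings claimed in the abstract; a learned embedding of queries/keys and a residual connection, as in standard non-local blocks, are omitted here for brevity.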

Related Material


[bibtex]
@InProceedings{Liu_2019_CVPR_Workshops,
author = {Liu, Haojie and Chen, Tong and Shen, Qiu and Ma, Zhan},
title = {Practical Stacked Non-local Attention Modules for Image Compression},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2019}
}