Joint Motion and Residual Information Latent Representation for P-Frame Coding

Renam Castro da Silva, Nilson Donizete Guerin Jr., Pedro Sanches, Henrique Costa Jung, Eduardo Peixoto, Bruno Macchiavello, Edson M. Hung, Vanessa Testoni, Pedro Garcia Freitas; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2020, pp. 146-147

Abstract


This paper proposes an inter-frame prediction method for the P-frame video compression challenge of the Workshop and Challenge on Learned Image Compression (CLIC). In this challenge, the current frame is compressed using an uncompressed reference (previous) frame, so this is not a complete solution for learning-based video compression. The main goal is to represent a set of frames at an average of 0.075 bpp (bits per pixel), which is a very low bitrate. The challenge also restricts the model size to avoid overfitting. We propose an autoencoder architecture that jointly represents the motion and residual information in the latent space. Three trained models are used to reach the target bpp, and a bit allocation algorithm is proposed to optimize the quality of the encoded dataset.
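
The abstract does not describe the bit allocation algorithm, so the following is only an illustrative sketch of how a per-frame choice among three models could be made under an average bpp budget. It uses a generic greedy upgrade heuristic, not the authors' method. The function names, the `(bits, quality)` option format, and the sample numbers are all hypothetical; only the 0.075 bpp target comes from the abstract.

```python
from typing import List, Tuple

TARGET_BPP = 0.075  # average bits per pixel stated in the abstract


def bits_per_pixel(total_bits: int, total_pixels: int) -> float:
    """Average bitrate of an encoded set of frames."""
    return total_bits / total_pixels


def allocate(options: List[List[Tuple[int, float]]],
             pixels_per_frame: int,
             target_bpp: float = TARGET_BPP) -> List[int]:
    """Pick one model index per frame with a greedy heuristic (assumption).

    options[f] lists hypothetical (bits, quality) pairs for frame f, one per
    trained model, sorted by increasing bits. Every frame starts at its
    cheapest model; the loop then upgrades the frame with the best quality
    gain per extra bit until no upgrade fits the budget. The sketch assumes
    the cheapest choices already fit within the budget.
    """
    budget = target_bpp * pixels_per_frame * len(options)
    choice = [0] * len(options)
    used = sum(opts[0][0] for opts in options)

    while True:
        best = None  # (gain per bit, frame index, extra bits)
        for f, opts in enumerate(options):
            nxt = choice[f] + 1
            if nxt >= len(opts):
                continue  # frame already uses its largest model
            extra = opts[nxt][0] - opts[choice[f]][0]
            gain = opts[nxt][1] - opts[choice[f]][1]
            if extra <= 0 or gain <= 0 or used + extra > budget:
                continue  # upgrade is useless or exceeds the budget
            ratio = gain / extra
            if best is None or ratio > best[0]:
                best = (ratio, f, extra)
        if best is None:
            return choice
        _, f, extra = best
        choice[f] += 1
        used += extra


# Hypothetical example: two frames of 1000 pixels, three models each.
sample_options = [
    [(40, 0.90), (70, 0.95), (120, 0.97)],
    [(40, 0.80), (70, 0.92), (120, 0.96)],
]
sample_choice = allocate(sample_options, pixels_per_frame=1000)
sample_bits = sum(sample_options[f][m][0] for f, m in enumerate(sample_choice))
```

In this sketch the quality values stand in for any per-frame metric; a real encoder would measure them by actually coding each frame with each model.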

Related Material


[bibtex]
@InProceedings{Silva_2020_CVPR_Workshops,
author = {da Silva, Renam Castro and Guerin Jr., Nilson Donizete and Sanches, Pedro and Jung, Henrique Costa and Peixoto, Eduardo and Macchiavello, Bruno and Hung, Edson M. and Testoni, Vanessa and Freitas, Pedro Garcia},
title = {Joint Motion and Residual Information Latent Representation for P-Frame Coding},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2020}
}