Deep Video Quality Assessor: From Spatio-temporal Visual Sensitivity to A Convolutional Neural Aggregation Network

Woojae Kim, Jongyoo Kim, Sewoong Ahn, Jinwoo Kim, Sanghoon Lee; Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 219-234

Abstract


Incorporating spatio-temporal human visual perception into video quality assessment (VQA) remains a formidable issue. Previous statistical or computational models of spatio-temporal perception have limitations to be applied to the general VQA algorithms. In this paper, we propose a novel full-reference (FR) VQA framework named Deep Video Quality Assessor (DeepVQA) to quantify the spatio-temporal visual perception via a convolutional neural network (CNN) and a convolutional neural aggregation network (CNAN). Our framework enables to figure out the spatio-temporal sensitivity behavior through learning in accordance with the subjective score. In addition, to manipulate the temporal variation of distortions, we propose a novel temporal pooling method using an attention model. In the experiment, we show DeepVQA remarkably achieves the state-of-the-art prediction accuracy of more than 0.9 correlation, which is ~5% higher than those of conventional methods on the LIVE and CSIQ video databases.

Related Material


[pdf]
[bibtex]
@InProceedings{Kim_2018_ECCV,
author = {Kim, Woojae and Kim, Jongyoo and Ahn, Sewoong and Kim, Jinwoo and Lee, Sanghoon},
title = {Deep Video Quality Assessor: From Spatio-temporal Visual Sensitivity to A Convolutional Neural Aggregation Network},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}