A Compressive Prior Guided Mask Predictive Coding Approach for Video Analysis

Zhimeng Huang, Chuanmin Jia, Shanshe Wang, Siwei Ma; Proceedings of the Asian Conference on Computer Vision (ACCV), 2022, pp. 4011-4026

Abstract


In real-world scenarios, video analysis algorithms are applied to visual signals after compression and transmission. Most codecs introduce irreversible distortion due to coarse quantization during compression, and this distortion may significantly degrade video analysis performance. To tackle this problem, we propose an efficient plug-and-play approach that explicitly preserves the essential semantic information in video sequences. The proposed approach can boost video analysis performance at a small extra bit cost. Specifically, we apply the proposed approach to an emerging video analysis task, video object segmentation (VOS). Extensive experimental results show that our work outperforms existing coding approaches on multiple VOS datasets. Concretely, it improves the analysis performance by up to 13% at similar bitrates. Additional experiments also verify the flexibility of our scheme, since it does not depend on any specific VOS model or encoding method. In essence, the proposed approach provides novel insights for the emerging Video Coding for Machines (VCM) standard.

Related Material


@InProceedings{Huang_2022_ACCV,
    author    = {Huang, Zhimeng and Jia, Chuanmin and Wang, Shanshe and Ma, Siwei},
    title     = {A Compressive Prior Guided Mask Predictive Coding Approach for Video Analysis},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {December},
    year      = {2022},
    pages     = {4011-4026}
}