VideoMatt: A Simple Baseline for Accessible Real-Time Video Matting

Jiachen Li, Marianna Ohanyan, Vidit Goel, Shant Navasardyan, Yunchao Wei, Humphrey Shi; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2023, pp. 2177-2186

Abstract


Real-time video matting has recently received growing attention from both academia and industry. However, most current state-of-the-art solutions are trained and evaluated on private or otherwise inaccessible matting datasets, which makes it hard for future researchers to conduct fair comparisons among models. Moreover, most methods are built upon image matting models with various cross-frame tricks to boost matting quality; simple and effective temporal modeling methods for real-time video matting remain under-explored. We therefore first composite a new video matting benchmark based purely on publicly accessible datasets for training and testing. We further empirically investigate various temporal modeling methods and compare their matting accuracy and inference speed. We name our method VideoMatt: a simple and strong real-time video matting baseline built on the newly composited, accessible benchmark. Extensive experiments show that our VideoMatt variants achieve better trade-offs between inference speed and matting quality than other state-of-the-art methods for real-time trimap-free video matting. We release the VideoMatt benchmark at https://drive.google.com/file/d/1QT4KHeGW3YrtBs1_7zovdCwCAofQ_GIj/view?usp=sharing.
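To make the idea of "temporal modeling" in the abstract concrete, the sketch below shows one of the simplest such strategies a frame-by-frame matting model can use: carrying a recurrent state across frames and blending it with the current frame's features via an exponential moving average. This is a generic, hypothetical illustration of the concept, not VideoMatt's actual architecture; the function names and the `momentum` weight are assumptions for the example.

```python
# Hypothetical sketch of recurrent temporal fusion for video matting.
# Features are represented as flat lists of floats for simplicity;
# a real model would operate on GPU tensors (e.g., feature maps).

def fuse_temporal(prev_state, cur_feat, momentum=0.8):
    """Blend the current frame's features with the recurrent state (EMA)."""
    if prev_state is None:           # first frame: no history yet
        return list(cur_feat)
    return [momentum * p + (1.0 - momentum) * c
            for p, c in zip(prev_state, cur_feat)]

def run_video(frames, momentum=0.8):
    """Process a sequence of per-frame features recurrently."""
    state = None
    outputs = []
    for feat in frames:
        state = fuse_temporal(state, feat, momentum)
        outputs.append(state)      # smoothed features for this frame
    return outputs
```

Such recurrent fusion adds negligible per-frame cost, which is why variants of it are attractive for real-time settings: temporal consistency improves without running a multi-frame network on every step.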

Related Material


@InProceedings{Li_2023_CVPR,
    author    = {Li, Jiachen and Ohanyan, Marianna and Goel, Vidit and Navasardyan, Shant and Wei, Yunchao and Shi, Humphrey},
    title     = {VideoMatt: A Simple Baseline for Accessible Real-Time Video Matting},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2023},
    pages     = {2177-2186}
}