Enforcing Temporal Consistency in Video Depth Estimation

Li, Siyuan; Luo, Yue; Zhu, Ye; Zhao, Xun; Li, Yu; Shan, Ying

Siyuan Li, Yue Luo, Ye Zhu, Xun Zhao, Yu Li, Ying Shan; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2021, pp. 1145-1154

Abstract

Most existing monocular depth estimation methods are trained on single images and show unsatisfactory temporal stability when applied to video. They may rely on post-processing to address this issue. A few video-based depth estimation methods use a reconstruction framework such as structure-from-motion, or sequential modeling. These methods make assumptions about the scenarios in which they can be applied, which limits their real-world use. In this work, we present a simple method for improving temporal consistency in video depth estimation. Specifically, we learn a prior from video data, and this prior can be imposed directly on any single-image monocular depth method. During testing, our method simply performs end-to-end forward inference frame by frame, without any sequential or multi-frame module. In addition, we propose an evaluation metric that quantitatively measures the temporal consistency of video depth predictions. It does not require labelled depth ground truth and only assesses flickering between consecutive frames. Experiments show that our method achieves improved temporal consistency on both a standard benchmark and general cases, without any post-processing or extra computational cost. A subjective study indicates that our proposed metric is consistent with users' visual perception, and that our results with higher consistency scores are indeed preferred. These features make our method a practical video depth estimator for predicting dense depth of real scenes and enable several video-depth-based applications.
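The abstract does not define the proposed metric, so the following is only a minimal sketch of what a ground-truth-free flicker measure between consecutive frames could look like, not the authors' formulation. The function name `flicker_score`, the relative-change formula, and the static-camera assumption (pixel (i, j) is taken to show the same scene point in every frame) are all illustrative assumptions; with camera or object motion, frames would first need to be aligned, for example by warping with optical flow.

```python
import numpy as np


def flicker_score(depths, eps=1e-6):
    """Mean relative absolute change between consecutive depth maps.

    Hypothetical illustration only; NOT the metric proposed in the paper.
    depths: array-like of shape (T, H, W) holding positive depth predictions.
    Assumes a static camera and scene, so no frame alignment is performed.
    Lower values mean less frame-to-frame flicker.
    """
    depths = np.asarray(depths, dtype=np.float64)
    if depths.ndim != 3 or depths.shape[0] < 2:
        raise ValueError("expected at least two frames with shape (T, H, W)")
    prev, curr = depths[:-1], depths[1:]
    # Per-pixel change relative to the previous frame's depth.
    rel_change = np.abs(curr - prev) / np.maximum(prev, eps)
    return float(rel_change.mean())


# A perfectly stable prediction versus one with a single flickering frame.
steady = np.full((4, 2, 2), 2.0)
flickering = steady.copy()
flickering[1] *= 1.5  # frame 1 jumps from depth 2.0 to 3.0, then back

steady_score = flicker_score(steady)
flickering_score = flicker_score(flickering)
```

No depth labels are needed to compute such a score: it only compares predictions with each other, which is the property the abstract highlights for the proposed metric.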

Related Material

[bibtex]

@InProceedings{Li_2021_ICCV,
    author    = {Li, Siyuan and Luo, Yue and Zhu, Ye and Zhao, Xun and Li, Yu and Shan, Ying},
    title     = {Enforcing Temporal Consistency in Video Depth Estimation},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
    month     = {October},
    year      = {2021},
    pages     = {1145-1154}
}