Event-Driven Video Frame Synthesis

Zihao W. Wang, Weixin Jiang, Kuan He, Boxin Shi, Aggelos Katsaggelos, Oliver Cossairt; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2019


Temporal Video Frame Synthesis (TVFS) aims at synthesizing novel frames at timestamps different from those of existing frames, with wide applications in video coding, editing, and analysis. In this paper, we propose a high-frame-rate TVFS framework that takes hybrid input from a low-speed frame-based sensor and a high-speed event-based sensor. Compared to frame-based sensors, event-based sensors report brightness changes at very high speed, which can provide useful spatio-temporal information for high-frame-rate TVFS. We therefore first introduce a differentiable fusion model that approximates the dual-modal physical sensing process and unifies a variety of TVFS scenarios, e.g., interpolation, prediction, and motion deblurring. Our differentiable model enables iterative optimization of the latent video tensor via autodifferentiation, which propagates the gradients of a loss function defined on the measured data. This model-based reconstruction does not involve training, yet is parallelizable and can be implemented on machine learning platforms (such as TensorFlow). Second, we develop a deep learning strategy to enhance the results of the first step, which we refer to as a residual "denoising" process. Our trained "denoiser" goes beyond Gaussian denoising and exhibits properties such as contrast enhancement and motion awareness. We show that our framework is capable of handling challenging scenes including both fast motion and strong occlusions.
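The model-based step described above can be illustrated with a toy sketch: recover a latent signal from (a) sparse low-rate frame samples and (b) dense temporal brightness differences standing in for idealized event measurements, by gradient descent on a loss defined over both measurement streams. This is a minimal illustrative example, not the paper's actual code; the 1-D signal, the noiseless difference model for events, and all variable names are assumptions, and the gradient is written out by hand rather than obtained by autodifferentiation on an ML platform.

```python
import numpy as np

T = 64
t = np.linspace(0.0, 1.0, T)
x_true = np.sin(2 * np.pi * t)            # ground-truth latent signal ("video")

# Low-speed frame sensor: sparse direct samples of the latent signal
frame_idx = np.r_[np.arange(0, T, 16), T - 1]
frames = x_true[frame_idx]

# Idealized event sensor: dense temporal differences (noiseless toy model)
events = np.diff(x_true)

def loss_and_grad(x):
    """Squared-error loss on both measurement streams, with its gradient."""
    r_f = x[frame_idx] - frames           # frame-measurement residual
    r_e = np.diff(x) - events             # event-measurement residual
    loss = 0.5 * (r_f @ r_f) + 0.5 * (r_e @ r_e)
    g = np.zeros_like(x)
    g[frame_idx] += r_f
    g[:-1] -= r_e                         # adjoint of np.diff: backward difference
    g[1:] += r_e
    return loss, g

x = np.zeros(T)                           # latent estimate, initialized to zero
for _ in range(3000):                     # plain gradient descent
    loss, g = loss_and_grad(x)
    x -= 0.2 * g

print("final loss:", float(loss))
print("max abs error:", float(np.max(np.abs(x - x_true))))
```

Because the constant ambiguity left by the difference (event) measurements is pinned down by the frame samples, the quadratic loss has a unique minimizer and the iteration converges to the true latent signal; in the paper's setting the same loop would run over a full video tensor with gradients supplied by autodifferentiation.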

Related Material

@InProceedings{Wang_2019_ICCV_Workshops,
    author    = {Wang, Zihao W. and Jiang, Weixin and He, Kuan and Shi, Boxin and Katsaggelos, Aggelos and Cossairt, Oliver},
    title     = {Event-Driven Video Frame Synthesis},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
    month     = {Oct},
    year      = {2019}
}