Spatio-temporal Transformer Network for Video Restoration

Tae Hyun Kim, Mehdi S. M. Sajjadi, Michael Hirsch, Bernhard Scholkopf; Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 106-122


State-of-the-art video restoration methods integrate optical flow estimation networks to utilize temporal information. However, these networks typically consider only a pair of consecutive frames and hence are not capable of capturing long-range temporal dependencies and fall short of establishing correspondences across several timesteps. To alleviate these problems, we propose a novel Spatio-temporal Transformer Network (STTN) which handles multiple frames at once and thereby manages to mitigate the common nuisance of occlusions in optical flow estimation. Our proposed STTN comprises a module that estimates optical flow in both space and time and a resampling layer that selectively warps target frames using the estimated flow. In our experiments, we demonstrate the efficiency of the proposed network and show state-of-the-art restoration results in video super-resolution and video deblurring.

Related Material

author = {Kim, Tae Hyun and Sajjadi, Mehdi S. M. and Hirsch, Michael and Scholkopf, Bernhard},
title = {Spatio-temporal Transformer Network for Video Restoration},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}