Dual-Stream Fusion Network for Spatiotemporal Video Super-Resolution
Upsampling of visual data has long been an important research topic for improving perceptual quality and benefiting various computer vision applications. In recent years, we have witnessed remarkable progress brought by the renaissance of deep learning techniques for video and image super-resolution. However, most existing works focus on advancing super-resolution along either the spatial or the temporal direction, i.e., increasing the spatial resolution or the video frame rate. In this paper, we instead address both directions jointly and tackle the spatiotemporal upsampling problem. Our method is based on an important observation: although a direct cascade of prior spatial and temporal super-resolution methods can achieve spatiotemporal upsampling, the two orders of combining them lead to results with complementary properties. Thus, we propose a dual-stream fusion network that adaptively fuses the intermediate results produced by two spatiotemporal upsampling streams, where the first stream applies spatial super-resolution followed by temporal super-resolution, while the second applies them in the reverse order. Extensive experiments verify the efficacy of the proposed model and its superior performance with respect to several baselines. Moreover, our investigation of utilizing various spatial and temporal upsampling methods as the basis of the two streams demonstrates the flexibility and wide applicability of the proposed framework.
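The dual-stream idea described above can be sketched in a few lines. The snippet below is an illustrative toy, not the paper's actual network: `spatial_sr` and `temporal_sr` are hypothetical stand-ins (nearest-neighbour upsampling and adjacent-frame averaging) for the learned super-resolution modules, and the scalar `weight` stands in for the adaptive, learned fusion the paper proposes.

```python
import numpy as np

def spatial_sr(frames, scale=2):
    # Placeholder spatial upsampler: nearest-neighbour repetition along
    # height and width (a real system would use a learned SR network).
    # frames has shape (T, H, W).
    return frames.repeat(scale, axis=1).repeat(scale, axis=2)

def temporal_sr(frames):
    # Placeholder temporal upsampler: insert the average of each adjacent
    # frame pair (a real system would use a learned interpolation network).
    out = []
    for a, b in zip(frames[:-1], frames[1:]):
        out.append(a)
        out.append((a + b) / 2.0)
    out.append(frames[-1])
    return np.stack(out)  # shape (2T - 1, H, W)

def dual_stream_fusion(frames, weight=0.5):
    # Stream 1: spatial SR followed by temporal SR.
    s1 = temporal_sr(spatial_sr(frames))
    # Stream 2: temporal SR followed by spatial SR.
    s2 = spatial_sr(temporal_sr(frames))
    # Fusion of the two intermediate results; a fixed scalar blend here,
    # whereas the paper learns an adaptive fusion.
    return weight * s1 + (1.0 - weight) * s2
```

With these simple linear placeholders the two streams happen to commute; the paper's point is that with real, learned (non-linear) upsamplers the two cascade orders produce complementary results, which motivates fusing them.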