MVFI-Net: Motion-aware Video Frame Interpolation Network

XuHu Lin, Lili Zhao, Xi Liu, Jianwen Chen; Proceedings of the Asian Conference on Computer Vision (ACCV), 2022, pp. 3690-3706


Video frame interpolation (VFI) synthesizes an intermediate frame from successive input frames. Most existing learning-based VFI methods generate each target pixel with a warping operation that uses a predicted kernel, a predicted flow, or both. However, their performance often degrades because the reference regions are limited in direction and scope, especially when encountering complex motions. In this paper, we propose a novel motion-aware VFI network (MVFI-Net) to address these issues. One of the key novelties of our method lies in a newly developed warping operation, motion-aware convolution (MAC). By predicting multiple extensible temporal motion vectors (MVs) and filter kernels for each target pixel, both the direction and the scope of the reference regions can be enlarged simultaneously. In addition, we make the first attempt to incorporate a pyramid structure into kernel-based VFI, which decomposes large motions into smaller scales to improve prediction efficiency. Quantitative and qualitative experimental results demonstrate that the proposed method delivers state-of-the-art performance on diverse benchmarks with various resolutions. Our codes are available at
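To make the idea of warping with several per-pixel motion vectors and filter weights more concrete, the following is a minimal NumPy sketch. It is not the authors' MAC implementation: the function names `bilinear_sample` and `offset_weighted_warp`, the tensor layouts, and the border clamping are all assumptions chosen for illustration, and the abstract does not specify how the MVs are made extensible or how the kernels are predicted. The sketch only shows the generic pattern in which each target pixel is a weighted sum of N samples taken at motion-displaced positions in a reference frame.

```python
import numpy as np


def bilinear_sample(frame, ys, xs):
    """Sample frame (H, W, C) at float coordinates ys, xs (each H, W).

    Coordinates are clamped to the image border (an illustrative choice).
    """
    h, w = frame.shape[:2]
    ys = np.clip(ys, 0, h - 1)
    xs = np.clip(xs, 0, w - 1)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[..., None]  # vertical interpolation weight
    wx = (xs - x0)[..., None]  # horizontal interpolation weight
    top = (1 - wx) * frame[y0, x0] + wx * frame[y0, x1]
    bottom = (1 - wx) * frame[y1, x0] + wx * frame[y1, x1]
    return (1 - wy) * top + wy * bottom


def offset_weighted_warp(frame, mvs, weights):
    """Hypothetical per-pixel warp with N motion vectors and N weights.

    frame:   (H, W, C) reference frame
    mvs:     (N, H, W, 2) displacements per target pixel, stored as (dy, dx)
    weights: (N, H, W) filter weight for each displaced sample
    """
    h, w = frame.shape[:2]
    gy, gx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    out = np.zeros(frame.shape, dtype=np.float64)
    for mv, wgt in zip(mvs, weights):
        sample = bilinear_sample(frame, gy + mv[..., 0], gx + mv[..., 1])
        out += wgt[..., None] * sample
    return out
```

In a full interpolation pipeline, a network would presumably predict `mvs` and `weights` for each input frame and blend the resulting warps; here they are plain arrays supplied by the caller. With zero displacements and weights that sum to one, the function returns the reference frame unchanged, which is a convenient sanity check.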

Related Material

@InProceedings{Lin_2022_ACCV,
    author    = {Lin, XuHu and Zhao, Lili and Liu, Xi and Chen, Jianwen},
    title     = {MVFI-Net: Motion-aware Video Frame Interpolation Network},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {December},
    year      = {2022},
    pages     = {3690-3706}
}