Multi-Frequency Representation Enhancement with Privilege Information for Video Super-Resolution

Fei Li, Linfeng Zhang, Zikun Liu, Juan Lei, Zhenbo Li; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 12814-12825

Abstract


The limited receptive field of CNNs restricts their ability to capture long-range spatial-temporal dependencies, leading to unsatisfactory performance in video super-resolution (VSR). To tackle this challenge, this paper presents a novel multi-frequency representation enhancement module (MFE) that performs spatial-temporal information aggregation in the frequency domain. Specifically, MFE consists mainly of a spatial-frequency representation enhancement branch, which captures long-range dependencies in the spatial dimension, and an energy-frequency representation enhancement branch, which captures inter-channel feature relationships. Moreover, a novel training method named privilege training is proposed to encode privilege information from high-resolution videos and thereby facilitate model training. With these two methods, we introduce a new VSR model named MFPI, which outperforms state-of-the-art methods by a large margin while maintaining good efficiency on various datasets, including REDS4, Vimeo, Vid4, and UDM10.
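
To illustrate why aggregating information in the frequency domain can give a global receptive field, the following minimal NumPy sketch multiplies the 2D spectrum of a feature map by per-frequency weights. By the convolution theorem this is equivalent to a circular convolution over the whole spatial extent, so every output position can depend on every input position. This is not the authors' MFE module: the function name `frequency_mix`, the tensor shapes, and the random weights are assumptions chosen purely for illustration.

```python
import numpy as np


def frequency_mix(features, weights):
    """Filter a feature map globally by pointwise multiplication in the frequency domain.

    features: real array of shape (C, H, W)
    weights:  complex array of shape (C, H, W // 2 + 1), one weight per
              channel and frequency bin (hypothetical stand-in for learned weights)
    """
    _, h, w = features.shape
    # Real 2D FFT over the spatial axes of each channel.
    spectrum = np.fft.rfft2(features, axes=(-2, -1))
    # Pointwise product in frequency = global circular convolution in space.
    mixed = spectrum * weights
    # Back to the spatial domain at the original resolution.
    return np.fft.irfft2(mixed, s=(h, w), axes=(-2, -1))


rng = np.random.default_rng(0)

# All-ones weights leave the features unchanged (up to floating-point error).
feats = rng.standard_normal((4, 8, 8))
identity = np.ones((4, 8, 5), dtype=complex)
same = frequency_mix(feats, identity)

# A single-pixel impulse spreads over the whole map under generic weights,
# which shows the global reach of the operation.
impulse = np.zeros((1, 8, 8))
impulse[0, 0, 0] = 1.0
random_weights = rng.standard_normal((1, 8, 5)) + 1j * rng.standard_normal((1, 8, 5))
response = frequency_mix(impulse, random_weights)
touched = int(np.count_nonzero(np.abs(response) > 1e-12))
```

Here `touched` counts how many output positions are influenced by the single input pixel; a 3x3 spatial convolution would reach at most nine, whereas the frequency-domain product is not limited by a kernel size.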

Related Material


[pdf] [supp]
[bibtex]
@InProceedings{Li_2023_ICCV,
    author    = {Li, Fei and Zhang, Linfeng and Liu, Zikun and Lei, Juan and Li, Zhenbo},
    title     = {Multi-Frequency Representation Enhancement with Privilege Information for Video Super-Resolution},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {12814-12825}
}