@InProceedings{Chen_2023_CVPR,
  author    = {Chen, Hao-Wei and Xu, Yu-Syuan and Hong, Min-Fong and Tsai, Yi-Min and Kuo, Hsien-Kai and Lee, Chun-Yi},
  title     = {Cascaded Local Implicit Transformer for Arbitrary-Scale Super-Resolution},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2023},
  pages     = {18257-18267}
}
Cascaded Local Implicit Transformer for Arbitrary-Scale Super-Resolution
Abstract
Implicit neural representations have recently demonstrated a promising ability to represent images at arbitrary resolutions. In this paper, we present the Local Implicit Transformer (LIT), which integrates an attention mechanism and a frequency encoding technique into a local implicit image function. We design a cross-scale local attention block to effectively aggregate local features, and a local frequency encoding block that combines positional encoding with Fourier-domain information for constructing high-resolution (HR) images. To further improve representational power, we propose the Cascaded LIT (CLIT), which exploits multi-scale features along with a cumulative training strategy that gradually increases the upsampling factors during training. We have performed extensive experiments to validate the effectiveness of these components and to analyze variants of the training strategy. The qualitative and quantitative results demonstrate that LIT and CLIT achieve favorable results and outperform previous works on arbitrary-scale super-resolution tasks.
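The frequency encoding idea described above can be illustrated with a minimal sketch: each HR query coordinate is expressed as an offset from its nearest low-resolution feature location, and that relative coordinate is lifted into a sinusoidal frequency basis before being fed to a decoder. This is a generic positional-encoding formulation, not the paper's exact block; the function name and shapes are illustrative assumptions.

```python
import numpy as np

def frequency_encoding(rel_coords, num_freqs=4):
    """Sinusoidal encoding of relative coordinates (illustrative sketch).

    rel_coords: (N, 2) array of (dx, dy) offsets between a query
    coordinate and its nearest low-resolution feature location.
    Returns an (N, 2 * 2 * num_freqs) array of [sin, cos] features.
    """
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi      # (num_freqs,)
    angles = rel_coords[..., None] * freqs             # (N, 2, num_freqs)
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return enc.reshape(rel_coords.shape[0], -1)

# For each HR pixel, encode its offset from the nearest LR cell centre.
rel = np.array([[0.25, -0.1], [0.0, 0.5]])
feat = frequency_encoding(rel, num_freqs=4)
print(feat.shape)  # (2, 16)
```

In practice such an encoding is concatenated with the sampled latent feature and passed through an MLP decoder that predicts the RGB value at the query coordinate.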