A Hybrid Network of CNN and Transformer for Lightweight Image Super-Resolution

Jinsheng Fang, Hanjiang Lin, Xinyu Chen, Kun Zeng; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2022, pp. 1103-1112

Abstract


Recently, a number of CNN-based methods have made great progress in single image super-resolution. However, these architectures commonly stack a massive number of network layers, which brings high computational complexity and heavy memory consumption and makes them unsuitable for embedded terminals such as mobile platforms. To solve this problem, we propose a hybrid network of CNN and Transformer (HNCT) for lightweight image super-resolution. HNCT consists of four parts: a shallow feature extraction module, Hybrid Blocks of CNN and Transformer (HBCTs), a dense feature fusion module, and an up-sampling module. By combining CNN and Transformer, HBCT extracts deep features beneficial for super-resolution reconstruction, taking both local and non-local priors into account while remaining lightweight and flexible. Enhanced spatial attention is introduced in HBCT to further improve performance. Extensive experimental results show that HNCT is superior to state-of-the-art methods in terms of super-resolution performance and model complexity. Moreover, our method achieved the second-best PSNR and the fewest activation operations in the NTIRE 2022 Efficient SR Challenge. Code is available at https://github.com/lhjthp/HNCT.
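The four-part structure named in the abstract can be illustrated with a minimal data-flow sketch. This is not the authors' implementation (see the repository linked above for that). The function name `hnct_forward`, its parameters, and the toy stand-ins are hypothetical. The sketch also assumes that the HBCTs are chained sequentially and that the fusion module receives every block's output, neither of which the abstract states.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")  # stand-in for a feature tensor


def hnct_forward(
    lr_image: T,
    shallow: Callable[[T], T],
    hbcts: Sequence[Callable[[T], T]],
    fuse: Callable[[List[T]], T],
    upsample: Callable[[T], T],
) -> T:
    """Data flow through the four parts named in the abstract.

    Assumptions (not stated in the abstract): blocks are chained
    sequentially, and the fusion module receives every block's output.
    """
    feat = shallow(lr_image)        # 1. shallow feature extraction
    block_outputs: List[T] = []
    for block in hbcts:             # 2. hybrid CNN/Transformer blocks
        feat = block(feat)
        block_outputs.append(feat)
    fused = fuse(block_outputs)     # 3. dense feature fusion
    return upsample(fused)          # 4. up-sampling to the SR image


# Toy stand-ins that record the call order instead of computing features.
def _tag(name: str) -> Callable[[str], str]:
    return lambda x: f"{name}({x})"


trace = hnct_forward(
    "LR",
    shallow=_tag("shallow"),
    hbcts=[_tag("hbct1"), _tag("hbct2")],
    fuse=lambda outs: "fuse[" + ", ".join(outs) + "]",
    upsample=_tag("up"),
)
print(trace)
```

In a real model each callable would be a learned module operating on feature maps. The string stand-ins only make the order of operations visible.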

Related Material


[bibtex]
@InProceedings{Fang_2022_CVPR,
    author    = {Fang, Jinsheng and Lin, Hanjiang and Chen, Xinyu and Zeng, Kun},
    title     = {A Hybrid Network of CNN and Transformer for Lightweight Image Super-Resolution},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2022},
    pages     = {1103-1112}
}