Lightweight Image Super-Resolution with Superpixel Token Interaction

Aiping Zhang, Wenqi Ren, Yi Liu, Xiaochun Cao; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 12728-12737


Transformer-based methods have demonstrated impressive results on the single-image super-resolution (SISR) task. However, the self-attention mechanism is computationally expensive when applied to the entire image. As a result, current approaches divide low-resolution input images into small patches, which are processed separately and then fused to generate high-resolution images. Nevertheless, this conventional regular patch division is too coarse and lacks interpretability, resulting in artifacts and interference from dissimilar structures during attention operations. To address these challenges, we propose a novel super token interaction network (SPIN). Our method employs superpixels to cluster locally similar pixels into interpretable local regions and utilizes intra-superpixel attention to enable local information interaction. It is interpretable because only similar regions complement each other, while dissimilar regions are excluded. Moreover, we design a superpixel cross-attention module that facilitates information propagation by using superpixels as surrogates. Extensive experiments demonstrate that the proposed SPIN model performs favorably against state-of-the-art SR methods in terms of accuracy and lightweight design. Code is available at
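To make the idea of intra-superpixel attention concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes that superpixel labels are already given (the clustering step is not shown), that queries, keys, and values are the raw pixel features with no learned projections, and that a dense N x N mask is acceptable for illustration. The function name `intra_superpixel_attention` is hypothetical.

```python
import numpy as np


def intra_superpixel_attention(features, labels):
    """Attention restricted to pixels that share a superpixel label.

    features: (N, C) array of per-pixel features.
    labels:   (N,) integer array of superpixel ids (assumed precomputed).
    Returns an (N, C) array of aggregated features.
    """
    n, c = features.shape
    # Scaled dot-product similarity between every pair of pixels.
    scores = features @ features.T / np.sqrt(c)
    # Keep only pairs inside the same superpixel; mask out all others.
    same_region = labels[:, None] == labels[None, :]
    scores = np.where(same_region, scores, -np.inf)
    # Row-wise softmax; every row keeps at least its own diagonal entry.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ features


# Toy example: six pixels grouped into three superpixels.
rng = np.random.default_rng(0)
pixels = rng.normal(size=(6, 4))
regions = np.array([0, 0, 0, 1, 1, 2])
output = intra_superpixel_attention(pixels, regions)
```

Because the mask removes every cross-region pair, changing the features of one superpixel leaves the outputs of the others untouched, which reflects the claim that dissimilar regions are excluded. A practical version would group pixels by superpixel instead of building the full N x N matrix.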

Related Material

@InProceedings{Zhang_2023_ICCV,
    author    = {Zhang, Aiping and Ren, Wenqi and Liu, Yi and Cao, Xiaochun},
    title     = {Lightweight Image Super-Resolution with Superpixel Token Interaction},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {12728-12737}
}