Hybrid Cross-View Attention Network for Lightweight Stereo Image Super-Resolution

Yuqiang Yang, Zhiming Zhang, Yao Du, Jingjing Yang, Long Bao, Heng Sun; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 6055-6064

Abstract


The goal of stereo image super-resolution is to enhance the quality of low-resolution stereo image pairs by utilizing complementary information across views. Although transformer-based methods have shown high efficiency in single-image super-resolution tasks, they have not been fully exploited in stereo super-resolution tasks. It is therefore crucial to incorporate the complementary information of stereo images into transformer-based methods to improve image details. To address this challenge, we propose a lightweight Hybrid Cross-view Attention Stereo Super-Resolution network (HCASSR), which uses a Transformer-based network for intra-view feature extraction and a cross-view attention module to aggregate stereo image information. We also employ multi-stage training strategies and test-time data ensembling to improve image quality. Our method has been extensively tested on the KITTI 2012, KITTI 2015, Middlebury, and Flickr1024 datasets, and the experimental results demonstrate that the proposed method outperforms existing works with a smaller model size. Additionally, we won 3rd and 2nd place, respectively, in Track 1 and Track 2 of the NTIRE 2024 Stereo Image Super-Resolution Challenge. Code and models will be released at https://github.com/YuqiangY/HCASSR.
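To make the idea of aggregating information across views concrete, the following is a minimal sketch of a generic row-wise cross-view attention step, assuming rectified stereo pairs so that corresponding pixels lie on the same image row. It is not the HCASSR module itself, whose design is not described in the abstract; the function name `cross_view_attention`, the scaled dot-product scoring, and the residual fusion are all illustrative assumptions.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_view_attention(feat_left, feat_right):
    """Hypothetical cross-view attention for rectified stereo features.

    feat_left, feat_right: arrays of shape (H, W, C).
    Each left-view pixel attends to all right-view pixels on the same row,
    and the attended right-view features are added to the left-view features.
    Returns the fused left features (H, W, C) and the attention maps (H, W, W).
    """
    _, _, c = feat_left.shape
    # scores[h, i, j]: similarity between left pixel (h, i) and right pixel (h, j)
    scores = np.einsum("hic,hjc->hij", feat_left, feat_right) / np.sqrt(c)
    attn = softmax(scores, axis=-1)
    # Weighted sum of right-view features for every left-view pixel
    warped = np.einsum("hij,hjc->hic", attn, feat_right)
    # Residual fusion (an assumption; real networks usually add learned projections)
    return feat_left + warped, attn


rng = np.random.default_rng(0)
left = rng.standard_normal((4, 6, 8))
right = rng.standard_normal((4, 6, 8))
fused, attn = cross_view_attention(left, right)
```

Swapping the two arguments gives the right-view counterpart. Restricting attention to a single row keeps the cost at O(H·W²) rather than O((H·W)²), which is one reason this pattern is common in lightweight stereo models.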

Related Material


[pdf]
[bibtex]
@InProceedings{Yang_2024_CVPR,
    author    = {Yang, Yuqiang and Zhang, Zhiming and Du, Yao and Yang, Jingjing and Bao, Long and Sun, Heng},
    title     = {Hybrid Cross-View Attention Network for Lightweight Stereo Image Super-Resolution},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {6055-6064}
}