@InProceedings{Zhong_2025_CVPR,
    author    = {Zhong, Hang and Wang, Yu and Zhao, Shengjie},
    title     = {SwinPaste: A Swin Transformer-Based Framework for RGB-Guided Thermal Image Super-Resolution},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2025},
    pages     = {4598-4603}
}
SwinPaste: A Swin Transformer-Based Framework for RGB-Guided Thermal Image Super-Resolution
Abstract
Thermal imaging plays a pivotal role across diverse applications, yet its efficacy is constrained by the inherently low resolution of widely accessible infrared (IR) cameras. Traditional super-resolution (SR) techniques often struggle on thermal images, primarily because such images contain few high-frequency details. To mitigate this, guided SR techniques use information from a high-resolution image, typically captured in the visible spectrum, to help reconstruct a high-resolution IR image from its low-resolution input. Inspired by SwinFuSR, we propose SwinPaste, an RGB-guided thermal image super-resolution model based on the Swin Transformer. First, we introduce a data mixing strategy during pre-training to enhance data diversity and improve model robustness. Second, we employ multi-scale supervision signals to recover high-frequency details effectively and improve reconstruction quality. Our method achieves 30.94 PSNR and 0.9201 SSIM at ×8 scale, and 26.33 PSNR and 0.8593 SSIM at ×16 scale, on the PBVS 2025 dataset, ranking second in Track 2 of the PBVS 2025 TISR Challenge.
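For readers unfamiliar with the PSNR figures quoted in the abstract, the following is a minimal sketch of the standard peak signal-to-noise ratio definition, 10 · log10(MAX² / MSE), in plain Python. It is not the challenge's evaluation code: the function name `psnr`, the nested-list image representation, and the default `max_value` of 255 (8-bit intensities) are assumptions made for illustration, and the official protocol may use a different value range or preprocessing.

```python
import math
from typing import Sequence


def psnr(
    reference: Sequence[Sequence[float]],
    estimate: Sequence[Sequence[float]],
    max_value: float = 255.0,  # assumed 8-bit range; not taken from the paper
) -> float:
    """Return the PSNR in dB between two equally sized 2-D images."""
    if len(reference) != len(estimate) or any(
        len(ref_row) != len(est_row)
        for ref_row, est_row in zip(reference, estimate)
    ):
        raise ValueError("images must have the same shape")

    # Accumulate the squared error over every pixel pair.
    squared_error = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        for ref_px, est_px in zip(ref_row, est_row):
            squared_error += (ref_px - est_px) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")

    mse = squared_error / count
    if mse == 0.0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate a reconstruction closer to the reference. Because the scale is logarithmic, a gain of 10 dB corresponds to a tenfold reduction in mean squared error.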