Single Pair Cross-Modality Super Resolution

Guy Shacht, Dov Danon, Sharon Fogel, Daniel Cohen-Or; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 6378-6387


Non-visual imaging sensors are widely used in the industry for different purposes. Those sensors are more expensive than visual (RGB) sensors, and usually produce images with lower resolution. To this end, Cross-Modality Super-Resolution methods were introduced, where an RGB image of a high-resolution assists in increasing the resolution of a low-resolution modality. However, fusing images from different modalities is not a trivial task, since each multi-modal pair varies greatly in its internal correlations. For this reason, traditional state-of-the-arts which are trained on external datasets often struggle with yielding an artifact-free result that is still loyal to the target modality characteristics. We present CMSR, a single-pair approach for Cross-Modality Super-Resolution. The network is internally trained on the two input images only, in a self-supervised manner, learns their internal statistics and correlations, and applies them to upsample the target modality. CMSR contains an internal transformer which is trained on-the-fly together with the up-sampling process itself and without supervision, to allow dealing with pairs that are only weakly aligned. We show that CMSR produces state-of-the-art super resolved images, yet without introducing artifacts or irrelevant details that originate from the RGB image only.

Related Material

[pdf] [supp] [arXiv]
@InProceedings{Shacht_2021_CVPR, author = {Shacht, Guy and Danon, Dov and Fogel, Sharon and Cohen-Or, Daniel}, title = {Single Pair Cross-Modality Super Resolution}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2021}, pages = {6378-6387} }