MvAV-pix2pixHD: Multi-view Aerial View Image Translation

Jun Yu, Keda Lu, Shenshen Du, Lin Xu, Peng Chang, Houde Liu, Bin Lan, Tianyu Liu; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 3066-3075

Abstract


Multi-modal aerial view image translation involves converting aerial images from one modality to another while preserving basic details and features. These modalities encompass Synthetic Aperture Radar (SAR), Infrared (IR), Visible Light (RGB), Electro-Optical (EO), and other image types. Recently, various methods have been proposed to tackle this task, but the focus tends to be on paired image research, overlooking the discrepancies found in aerial images of the same location captured at different times and angles, termed incomplete matching or multi-view image translation. Consequently, we propose MvAV-pix2pixHD to address this issue. For multi-view data sampling, we propose two methods: random sampling and time-priority sampling. Additionally, within the pix2pixHD framework, we introduce an inverse generator to ensure the basic semantic features of the generated images and incorporate three robust loss functions to constrain the authenticity of the generated images. We conduct extensive experiments on two multi-view image translation tasks in the Multi-modal Aerial View Imagery Challenge: Translation (MAVIC-T). Experimental results demonstrate the superiority of our proposed method, and we achieved second place in the MAVIC-T competition at the 20th IEEE Workshop on Perception Beyond the Visible Spectrum at CVPR 2024.
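The abstract names two multi-view sampling strategies but does not define them. The sketch below shows one possible reading, offered only as an illustration: random sampling picks any candidate target view of the same location uniformly at random, while time-priority sampling prefers the candidate whose capture time is closest to that of the source image. The `View` record, its fields, and both function names are hypothetical and are not taken from the paper.

```python
import random
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class View:
    """One candidate target image of a location (hypothetical record)."""
    path: str
    captured_at: datetime


def random_sampling(candidates, rng):
    """Assumed reading: pick any candidate target view uniformly at random."""
    return rng.choice(candidates)


def time_priority_sampling(source_time, candidates):
    """Assumed reading: pick the candidate captured closest in time to the source."""
    return min(
        candidates,
        key=lambda view: abs((view.captured_at - source_time).total_seconds()),
    )


if __name__ == "__main__":
    candidates = [
        View("target_a.png", datetime(2023, 1, 1)),
        View("target_b.png", datetime(2023, 6, 1)),
        View("target_c.png", datetime(2024, 1, 1)),
    ]
    source_time = datetime(2023, 5, 20)
    print(random_sampling(candidates, random.Random(0)).path)
    print(time_priority_sampling(source_time, candidates).path)
```

Under this reading, random sampling exposes the generator to the full spread of viewpoint and time discrepancies, whereas time-priority sampling narrows each training pair to the most temporally consistent match. The paper itself should be consulted for the actual definitions.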

Related Material


[bibtex]
@InProceedings{Yu_2024_CVPR,
    author    = {Yu, Jun and Lu, Keda and Du, Shenshen and Xu, Lin and Chang, Peng and Liu, Houde and Lan, Bin and Liu, Tianyu},
    title     = {MvAV-pix2pixHD: Multi-view Aerial View Image Translation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {3066-3075}
}