Bridging the Domain Gap for Ground-to-Aerial Image Matching

Krishna Regmi, Mubarak Shah; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 470-479


The visual entities in cross-view (e.g. ground and aerial) images exhibit drastic domain changes due to the differences in viewpoints each set of images is captured from. Existing state-of-the-art methods address the problem by learning view-invariant images descriptors. We propose a novel method for solving this task by exploiting the gener- ative powers of conditional GANs to synthesize an aerial representation of a ground-level panorama query and use it to minimize the domain gap between the two views. The synthesized image being from the same view as the ref- erence (target) image, helps the network to preserve im- portant cues in aerial images following our Joint Feature Learning approach. We fuse the complementary features from a synthesized aerial image with the original ground- level panorama features to obtain a robust query represen- tation. In addition, we employ multi-scale feature aggre- gation in order to preserve image representations at dif- ferent scales useful for solving this complex task. Experi- mental results show that our proposed approach performs significantly better than the state-of-the-art methods on the challenging CVUSA dataset in terms of top-1 and top-1% retrieval accuracies. Furthermore, we evaluate the gen- eralization of the proposed method for urban landscapes on our newly collected cross-view localization dataset with geo-reference information.

Related Material

author = {Regmi, Krishna and Shah, Mubarak},
title = {Bridging the Domain Gap for Ground-to-Aerial Image Matching},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019}