Efficient Large-Scale Semantic Visual Localization in 2D Maps

Vojir, Tomas; Budvytis, Ignas; Cipolla, Roberto

Tomas Vojir, Ignas Budvytis, Roberto Cipolla; Proceedings of the Asian Conference on Computer Vision (ACCV), 2020

Abstract

With the emergence of autonomous navigation systems, image-based localization is one of the essential tasks to be tackled. However, most of the current algorithms struggle to scale to city-size environments mainly because of the need to collect large (semi-)annotated datasets for CNN training and create databases for test environment of images, keypoint level features or image embeddings. This data acquisition is not only expensive and time-consuming but also may cause privacy concerns. In this work, we propose a novel framework for semantic visual localization in city-scale environments which alleviates the aforementioned problem by using freely available 2D maps such as OpenStreetMap. Our method does not require any images or image-map pairs for training or test environment database collection. Instead, a robust embedding is learned from a depth and building instance label information of a particular location in the 2D map. At test time, this embedding is extracted from a panoramic building instance label and depth images. It is then used to retrieve the closest match in the database.We evaluate our localization framework on two large-scale datasets consisting of Cambridge and San Francisco cities with a total length of drivable roads spanning over 500 km and including approximately 110k unique locations. To the best of our knowledge, this is the first large-scale semantic localization method which works on par with approaches that require the availability of images at train time or for test environment database creation.

Related Material

[pdf] [supp]

[bibtex]

@InProceedings{Vojir_2020_ACCV, author = {Vojir, Tomas and Budvytis, Ignas and Cipolla, Roberto}, title = {Efficient Large-Scale Semantic Visual Localization in 2D Maps}, booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)}, month = {November}, year = {2020} }