EigenPlaces: Training Viewpoint Robust Models for Visual Place Recognition

Gabriele Berton, Gabriele Trivigno, Barbara Caputo, Carlo Masone; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 11080-11090

Abstract


Visual Place Recognition is a task that aims to predict the place of an image (called query) based solely on its visual features. This is typically done through image retrieval, where the query is matched to the most similar images from a large database of geotagged photos, using learned global descriptors. A major challenge in this task is recognizing places seen from different viewpoints. To address this challenge, we propose a new method, called EigenPlaces, to train our neural network on images from different points of view, which embeds viewpoint robustness into the learned global descriptors. The underlying idea is to cluster the training data so as to explicitly present the model with different views of the same points of interest. The selection of these points of interest is done without the need for extra supervision. We then present experiments on the most comprehensive set of datasets in the literature, finding that EigenPlaces is able to outperform the previous state of the art on the majority of datasets, while requiring 60% less GPU memory for training and using 50% smaller descriptors. The code and trained models for EigenPlaces are available at https://github.com/gmberton/EigenPlaces, while results with any other baseline can be computed with the codebase at https://github.com/gmberton/auto_VPR.
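
The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not taken from the EigenPlaces codebase: the function name `retrieve_top_k`, the use of cosine similarity, and the toy 3-dimensional descriptors are assumptions made only for illustration. In practice the descriptors would come from a trained network and the database would hold many geotagged images.

```python
import numpy as np


def retrieve_top_k(query_desc, db_descs, k=5):
    """Return indices and similarities of the k database descriptors closest to the query.

    Hypothetical helper: ranks by cosine similarity between global descriptors.
    """
    # L2-normalize so that the dot product equals cosine similarity.
    q = query_desc / np.linalg.norm(query_desc)
    db = db_descs / np.linalg.norm(db_descs, axis=1, keepdims=True)
    sims = db @ q
    # Sort by descending similarity and keep the first k entries.
    top = np.argsort(-sims)[:k]
    return top, sims[top]


# Toy database of three descriptors; a real one is produced by the model.
database = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0],
])
query = np.array([1.0, 0.0, 0.0])

indices, similarities = retrieve_top_k(query, database, k=2)
print(indices, similarities)
```

The geotag of the top-ranked database image would then serve as the predicted place of the query. For large databases, an approximate nearest-neighbor index would typically replace the exhaustive dot product used here.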

Related Material


[bibtex]
@InProceedings{Berton_2023_ICCV,
    author    = {Berton, Gabriele and Trivigno, Gabriele and Caputo, Barbara and Masone, Carlo},
    title     = {EigenPlaces: Training Viewpoint Robust Models for Visual Place Recognition},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {11080-11090}
}