CASSPR: Cross Attention Single Scan Place Recognition

Yan Xia, Mariia Gladkova, Rui Wang, Qianyun Li, Uwe Stilla, João F Henriques, Daniel Cremers; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 8461-8472

Abstract


Place recognition based on LiDAR point clouds is an important component of autonomous robots and self-driving vehicles. Current state-of-the-art (SOTA) performance is achieved on accumulated LiDAR submaps using either point-based or voxel-based structures. While voxel-based approaches integrate spatial context well across multiple scales, they lack the local precision of point-based methods. As a result, existing methods struggle with fine-grained matching of subtle geometric features in sparse single-shot LiDAR scans. To overcome these limitations, we propose CASSPR, a method that fuses point-based and voxel-based approaches using cross-attention transformers. CASSPR uses a sparse voxel branch to extract and aggregate information at lower resolution and a point-wise branch to obtain fine-grained local information. Queries from one branch are matched against structures in the other branch, so that each branch extracts a self-contained descriptor of the point cloud (rather than one branch dominating) while both inform the output global descriptor. Extensive experiments show that CASSPR surpasses the state of the art by a large margin on several datasets (Oxford RobotCar, TUM, USyd). For instance, it achieves an AR@1 of 85.6% on the TUM dataset, surpassing the strongest prior model by 15%. Our code is publicly available.
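
The bidirectional querying described in the abstract can be illustrated with a minimal sketch of scaled dot-product cross attention between two feature sets. This is not the CASSPR implementation: the function names (`cross_attention`, `fuse`), the absence of learned query/key/value projections, the residual addition, and the max-pooling used to form a global descriptor are all assumptions made for illustration only.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Each query row attends over all key rows and returns a
    weighted sum of the value rows (scaled dot-product attention,
    without learned projections)."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    dim = len(values[0])
    out = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(dim)])
    return out


def fuse(point_feats, voxel_feats):
    """Hypothetical fusion: each branch queries the other, keeps its own
    features via a residual sum, and the two branches are max-pooled and
    concatenated into one global descriptor."""
    # Point features query the voxel branch, and vice versa.
    p_from_v = cross_attention(point_feats, voxel_feats, voxel_feats)
    v_from_p = cross_attention(voxel_feats, point_feats, point_feats)

    # Residual sum so neither branch is replaced by the other (assumption).
    point_out = [[a + b for a, b in zip(p, u)]
                 for p, u in zip(point_feats, p_from_v)]
    voxel_out = [[a + b for a, b in zip(v, u)]
                 for v, u in zip(voxel_feats, v_from_p)]

    def max_pool(rows):
        # Column-wise maximum over all rows.
        return [max(col) for col in zip(*rows)]

    return max_pool(point_out) + max_pool(voxel_out)


# Toy input: 3 point features and 2 voxel features, both of dimension 2.
point_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
voxel_feats = [[0.5, 0.5], [2.0, -1.0]]
descriptor = fuse(point_feats, voxel_feats)
```

In this sketch the resulting descriptor has twice the per-branch feature dimension, since each branch contributes its own pooled vector. A real model would add learned projections, multiple heads, and normalization layers.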

Related Material


@InProceedings{Xia_2023_ICCV,
    author    = {Xia, Yan and Gladkova, Mariia and Wang, Rui and Li, Qianyun and Stilla, Uwe and Henriques, Jo\~ao F and Cremers, Daniel},
    title     = {CASSPR: Cross Attention Single Scan Place Recognition},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {8461-8472}
}