Pose Invariant Topological Memory for Visual Navigation

Asuto Taniguchi, Fumihiro Sasaki, Ryota Yamashina; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 15384-15393

Abstract

Planning for visual navigation using topological memory, a memory graph consisting of nodes and edges, has recently been well studied. The nodes correspond to past observations of a robot, and the edges represent reachability predicted by a neural network (NN). Most prior methods, however, often fail to predict reachability when the robot takes different poses, i.e., the direction the robot faces, at nearby positions. This is because these methods observe first-person-view images, which change significantly when the robot changes its pose, so it is fundamentally difficult to predict reachability correctly from them. In this paper, we propose pose invariant topological memory (POINT) to address this problem. POINT observes omnidirectional images and predicts reachability using a spherical convolutional NN, which has a rotation-invariance property and enables planning regardless of the robot's pose. Additionally, we train the NN by contrastive learning with data augmentation so that POINT can plan robustly under changes in environmental conditions, such as lighting and the presence of unseen objects. Our experimental results show that POINT outperforms conventional methods under both the same and different environmental conditions. In addition, results on the KITTI-360 dataset show that POINT is more applicable to real-world environments than conventional methods.
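To make the topological-memory idea concrete, below is a minimal, hypothetical sketch in Python. The `TopologicalMemory` class, the `toy_reachability` function, and the 0.7 threshold are illustrative assumptions, not the paper's implementation: in POINT the reachability score would come from a spherical convolutional NN applied to pairs of omnidirectional images, whereas the toy predictor here simply thresholds the Euclidean distance between 2-D positions.

```python
from collections import deque

import numpy as np


class TopologicalMemory:
    """Toy topological memory: nodes hold past observations; an edge joins
    two nodes whenever a reachability predictor scores the pair above a
    threshold. Planning is then a shortest-path search on this graph."""

    def __init__(self, reachability_fn, threshold=0.7):
        # reachability_fn(obs_a, obs_b) -> score in [0, 1]. In POINT this
        # role is played by a learned spherical CNN; here it is a stand-in.
        self.reachability_fn = reachability_fn
        self.threshold = threshold
        self.observations = []   # node id -> observation
        self.adjacency = {}      # node id -> set of neighbouring node ids

    def add_observation(self, obs):
        """Insert a new observation and link it to all reachable nodes."""
        new_id = len(self.observations)
        self.observations.append(obs)
        self.adjacency[new_id] = set()
        for node_id in range(new_id):
            score = self.reachability_fn(self.observations[node_id], obs)
            if score >= self.threshold:
                self.adjacency[node_id].add(new_id)
                self.adjacency[new_id].add(node_id)
        return new_id

    def plan(self, start_id, goal_id):
        """Breadth-first search over the memory graph (unit edge costs).
        Returns a list of node ids, or None if the goal is unreachable."""
        parents = {start_id: None}
        frontier = deque([start_id])
        while frontier:
            node = frontier.popleft()
            if node == goal_id:
                path = []
                while node is not None:
                    path.append(node)
                    node = parents[node]
                return path[::-1]
            for nbr in self.adjacency[node]:
                if nbr not in parents:
                    parents[nbr] = node
                    frontier.append(nbr)
        return None


# Toy stand-in for the learned predictor: observations are 2-D positions,
# and a pair counts as reachable when the points lie within one metre.
def toy_reachability(obs_a, obs_b):
    dist = np.linalg.norm(np.asarray(obs_a) - np.asarray(obs_b))
    return 1.0 if dist <= 1.0 else 0.0


memory = TopologicalMemory(toy_reachability)
for pos in [(0.0, 0.0), (0.8, 0.0), (1.6, 0.0), (1.6, 0.9)]:
    memory.add_observation(pos)
print(memory.plan(0, 3))  # -> [0, 1, 2, 3]
```

The point the sketch illustrates is that once edges encode reliable reachability, planning reduces to ordinary graph search; this is why a pose-invariant predictor matters, since a first-person-view predictor would fail to connect nodes taken at nearby positions but different headings.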

Related Material


BibTeX
@InProceedings{Taniguchi_2021_ICCV,
  author    = {Taniguchi, Asuto and Sasaki, Fumihiro and Yamashina, Ryota},
  title     = {Pose Invariant Topological Memory for Visual Navigation},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  month     = {October},
  year      = {2021},
  pages     = {15384-15393}
}