Multi-Object Navigation with Dynamically Learned Neural Implicit Representations

Marza, Pierre; Matignon, Laetitia; Simonin, Olivier; Wolf, Christian

Pierre Marza, Laetitia Matignon, Olivier Simonin, Christian Wolf; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 11004-11015

Abstract

Understanding and mapping a new environment are core abilities of any autonomously navigating agent. While classical robotics usually estimates maps in a stand-alone manner with SLAM variants, which maintain a topological or metric representation, end-to-end learning of navigation keeps some form of memory in a neural network. Networks are typically imbued with inductive biases, which can range from vectorial representations to bird's-eye metric tensors or topological structures. In this work, we propose to structure neural networks with two neural implicit representations, which are learned dynamically during each episode and map the content of the scene: (i) the Semantic Finder predicts the position of a previously seen queried object; (ii) the Occupancy and Exploration Implicit Representation encapsulates information about explored area and obstacles, and is queried with a novel global read mechanism which directly maps from function space to a usable embedding space. Both representations are learned online during each episode and are leveraged by an agent trained with Reinforcement Learning (RL). We evaluate the agent on Multi-Object Navigation and show the high impact of using neural implicit representations as a memory source.
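To make the idea of an implicit representation learned within an episode more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy setting in which a small network maps a one-hot object query to a 2D position and is fitted by plain gradient descent on the observations gathered so far. The names `TinyImplicitFinder`, `one_hot`, and `run_episode`, the network size, the learning rate, and the sample observations are all hypothetical choices made for illustration.

```python
import math
import random


class TinyImplicitFinder:
    """Hypothetical stand-in: an MLP mapping a one-hot object query to (x, y)."""

    def __init__(self, num_classes, hidden=16, lr=0.02, seed=0):
        rng = random.Random(seed)
        self.lr = lr
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(num_classes)]
                   for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)]
                   for _ in range(2)]
        self.b2 = [0.0, 0.0]

    def _forward(self, query):
        # Hidden layer with tanh, then a linear output layer.
        h = [math.tanh(sum(w * q for w, q in zip(row, query)) + b)
             for row, b in zip(self.w1, self.b1)]
        out = [sum(w * v for w, v in zip(row, h)) + b
               for row, b in zip(self.w2, self.b2)]
        return h, out

    def predict(self, query):
        return self._forward(query)[1]

    def update(self, query, position):
        """One gradient step on 0.5 * squared error for a single observation."""
        h, out = self._forward(query)
        err = [o - p for o, p in zip(out, position)]
        # Gradient w.r.t. hidden pre-activations, using pre-update output weights.
        grad_h = [sum(err[k] * self.w2[k][j] for k in range(2)) * (1.0 - h[j] ** 2)
                  for j in range(len(h))]
        for k in range(2):
            for j in range(len(h)):
                self.w2[k][j] -= self.lr * err[k] * h[j]
            self.b2[k] -= self.lr * err[k]
        for j in range(len(h)):
            for i, q in enumerate(query):
                self.w1[j][i] -= self.lr * grad_h[j] * q
            self.b1[j] -= self.lr * grad_h[j]


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


def run_episode():
    # The representation starts from scratch and is fitted only to what
    # this (made-up) episode has observed: (object class, position) pairs.
    finder = TinyImplicitFinder(num_classes=3)
    observations = [(0, (1.0, 2.0)), (1, (-1.5, 0.5)), (2, (0.0, -2.0))]
    for _ in range(1000):
        for cls, pos in observations:
            finder.update(one_hot(cls, 3), pos)
    return finder, observations
```

After `run_episode()`, calling `finder.predict(one_hot(cls, 3))` should return a position close to the one observed for that class. The point of the sketch is only that the scene memory lives in the network weights and is queried by evaluating the function, rather than being stored as an explicit map.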

Related Material

[bibtex]

@InProceedings{Marza_2023_ICCV,
    author    = {Marza, Pierre and Matignon, Laetitia and Simonin, Olivier and Wolf, Christian},
    title     = {Multi-Object Navigation with Dynamically Learned Neural Implicit Representations},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {11004-11015}
}