What Do Navigation Agents Learn About Their Environment?

Kshitij Dwivedi, Gemma Roig, Aniruddha Kembhavi, Roozbeh Mottaghi; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 10276-10285


Today's state-of-the-art visual navigation agents typically consist of large deep learning architectures trained end to end. Such models offer little to no interpretability, either about the skills the agent has learned or about the actions it takes in response to its environment. While past work has explored interpreting deep learning models, little attention has been devoted to interpreting embodied AI systems, which often involve reasoning about the structure of the environment, target characteristics, and the outcome of one's actions. In this paper, we introduce the Interpretability System for Embodied agEnts (iSEE) for Point Goal (PointNav) and Object Goal (ObjectNav) navigation models. We use iSEE to probe the dynamic representations produced by PointNav and ObjectNav agents for the presence of information about the agent's location and actions, as well as about the environment. Using iSEE, we uncover interesting insights about navigation agents, including their ability to encode reachable locations (to avoid obstacles), the visibility of the target, and progress from the initial spawn location, as well as the dramatic effect on agent behavior when critical individual neurons are masked out.
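The core idea of probing, as described above, is to test whether a property of interest (e.g. target visibility) can be decoded from an agent's hidden states. Below is a minimal, hedged sketch of that idea using a linear probe on entirely synthetic data; it is not the paper's implementation, and the variable names (hidden states `H`, label `y`) are illustrative stand-ins for the recurrent representations and environment properties that iSEE actually probes.

```python
import numpy as np

# Hedged sketch (not the paper's exact method): train a linear probe on an
# agent's hidden states to test whether a binary environment property, such
# as "is the target visible?", is decodable from them. All data is synthetic.

rng = np.random.default_rng(0)

n, d = 1000, 64                       # timesteps, hidden-state dimensionality
w_true = rng.normal(size=d)           # hypothetical direction encoding the label
H = rng.normal(size=(n, d))           # stand-in for recurrent hidden states
y = (H @ w_true > 0).astype(float)    # synthetic binary label per timestep

# Ridge-regression probe on the centered label (closed form):
#   w = (H^T H + lam*I)^{-1} H^T (y - mean(y))
lam = 1e-2
w = np.linalg.solve(H.T @ H + lam * np.eye(d), H.T @ (y - y.mean()))

# If the probe decodes the label well above chance (0.5), the property is
# linearly represented in the hidden states.
acc = ((H @ w > 0) == (y > 0.5)).mean()
print(f"probe decoding accuracy: {acc:.2f}")
```

High decoding accuracy here only shows that the information is present in the representation; causal claims (as in the paper's neuron-masking experiments) require intervening on the states rather than just reading them out.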

Related Material

@InProceedings{Dwivedi_2022_CVPR,
  author    = {Dwivedi, Kshitij and Roig, Gemma and Kembhavi, Aniruddha and Mottaghi, Roozbeh},
  title     = {What Do Navigation Agents Learn About Their Environment?},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2022},
  pages     = {10276-10285}
}