Fusing Semantics and Motion State Detection for Robust Visual SLAM

Gaurav Singh, Meiqing Wu, Siew-Kei Lam; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2020, pp. 2764-2773

Abstract


Achieving robust pose tracking and mapping in highly dynamic environments is a major challenge faced by existing visual SLAM (vSLAM) systems. In this paper, we increase the robustness of existing vSLAM by accurately removing moving objects from the scene so that they will not contribute to pose estimation and mapping. Specifically, semantic information is fused with motion states of the scene via a probability framework to enable accurate and robust moving object extraction in order to retain the useful features for pose estimation and mapping. Our work highlights the importance of distinguishing between motion states of potential moving objects for vSLAM in highly dynamic environments. The proposed method can be integrated into existing vSLAM systems to increase their robustness in dynamic environments without incurring much computation cost. We provide extensive experimental results on three well-known datasets to show that the proposed technique outperforms existing vSLAM methods in indoor and outdoor environments, under various scenarios such as crowded scenes.

Related Material


[pdf] [video]
[bibtex]
@InProceedings{Singh_2020_WACV,
author = {Singh, Gaurav and Wu, Meiqing and Lam, Siew-Kei},
title = {Fusing Semantics and Motion State Detection for Robust Visual SLAM},
booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
month = {March},
year = {2020}
}