Mask-SLAM: Robust Feature-Based Monocular SLAM by Masking Using Semantic Segmentation

Masaya Kaneko, Kazuya Iwami, Toru Ogawa, Toshihiko Yamasaki, Kiyoharu Aizawa; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2018, pp. 258-266

Abstract
In this paper, we propose a novel method that combines monocular visual simultaneous localization and mapping (vSLAM) with deep-learning-based semantic segmentation. For stable operation, vSLAM requires feature points on static objects. In conventional vSLAM, random sample consensus (RANSAC) is used to select those feature points. However, if a major portion of the view is occupied by moving objects, many feature points become inappropriate and RANSAC does not perform well. Our empirical studies show that feature points in the sky and on cars often cause errors in vSLAM. We propose a new framework that excludes feature points using a mask produced by semantic segmentation. Excluding feature points in masked areas enables vSLAM to estimate camera motion stably. In our framework, we apply ORB-SLAM, a state-of-the-art implementation of monocular vSLAM. For our experiments, we created vSLAM evaluation datasets using the CARLA simulator under various conditions. Compared to state-of-the-art methods, our method achieves significantly higher accuracy.
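To illustrate the masking idea described above (this is a minimal sketch, not the authors' implementation), the snippet below uses OpenCV's ORB detector and a per-pixel semantic label map to suppress feature points in excluded regions. The class IDs for sky and car are hypothetical placeholders; the actual label set depends on the segmentation network used.

```python
import cv2
import numpy as np

# Hypothetical class IDs for the segmentation output; the real values
# depend on the semantic segmentation model and its label set.
SKY_CLASS = 10
CAR_CLASS = 13

def detect_masked_orb(gray_image, seg_labels, n_features=2000):
    """Detect ORB keypoints only outside sky/car regions.

    gray_image: HxW uint8 grayscale frame.
    seg_labels: HxW integer array of per-pixel class IDs
                produced by a semantic segmentation network.
    """
    # Binary mask: 255 where feature detection is allowed, 0 where excluded.
    mask = np.full(seg_labels.shape, 255, dtype=np.uint8)
    mask[(seg_labels == SKY_CLASS) | (seg_labels == CAR_CLASS)] = 0

    orb = cv2.ORB_create(nfeatures=n_features)
    # OpenCV restricts detection to non-zero mask pixels, so keypoints
    # on masked (dynamic or textureless) regions are never extracted.
    keypoints, descriptors = orb.detectAndCompute(gray_image, mask)
    return keypoints, descriptors
```

Because the mask is applied at detection time, the downstream matching and RANSAC stages of a feature-based pipeline run unchanged, only on the surviving feature points.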

Related Material
[pdf]
[bibtex]
@InProceedings{Kaneko_2018_CVPR_Workshops,
author = {Kaneko, Masaya and Iwami, Kazuya and Ogawa, Toru and Yamasaki, Toshihiko and Aizawa, Kiyoharu},
title = {Mask-SLAM: Robust Feature-Based Monocular SLAM by Masking Using Semantic Segmentation},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2018}
}