The Mapillary Vistas Dataset for Semantic Understanding of Street Scenes

Gerhard Neuhold, Tobias Ollmann, Samuel Rota Bulo, Peter Kontschieder; Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017, pp. 4990-4999

Abstract


The Mapillary Vistas Dataset is a novel, large-scale street-level image dataset containing 25,000 high-resolution images annotated into 66 object categories with additional, instance-specific labels for 37 classes. Annotation is performed in a dense and fine-grained style by using polygons for delineating individual objects. Our dataset is 5x larger than the total amount of fine annotations for Cityscapes and contains images from all around the world, captured at various conditions regarding weather, season and daytime. Images come from different imaging devices (mobile phones, tablets, action cameras, professional capturing rigs) and differently experienced photographers. In such a way, our dataset has been designed and compiled to cover diversity, richness of detail and geographic extent. As default benchmark tasks, we define semantic image segmentation and instance-specific image segmentation, aiming to significantly further the development of state-of-the-art methods for visual road-scene understanding.

Related Material


[pdf] [supp]
[bibtex]
@InProceedings{Neuhold_2017_ICCV,
author = {Neuhold, Gerhard and Ollmann, Tobias and Rota Bulo, Samuel and Kontschieder, Peter},
title = {The Mapillary Vistas Dataset for Semantic Understanding of Street Scenes},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
month = {Oct},
year = {2017}
}