Seamless Scene Segmentation

Lorenzo Porzi, Samuel Rota Bulo, Aleksander Colovic, Peter Kontschieder; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 8277-8286


In this work we introduce a novel, CNN-based architecture that can be trained end-to-end to deliver seamless scene segmentation results. Our goal is to predict consistent semantic segmentation and detection results by means of a panoptic output format, going beyond the simple combination of independently trained segmentation and detection models. The proposed architecture takes advantage of a novel segmentation head that seamlessly integrates multi-scale features generated by a Feature Pyramid Network with contextual information conveyed by a light-weight DeepLab-like module. As an additional contribution, we review the panoptic metric and propose an alternative that overcomes its limitations when evaluating non-instance categories. Our proposed network architecture yields state-of-the-art results on three challenging street-level datasets, namely Cityscapes, the Indian Driving Dataset, and Mapillary Vistas.
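For readers unfamiliar with the panoptic metric that the abstract refers to, the following is a minimal sketch of the standard panoptic quality (PQ) computation for a single class, as commonly defined in the panoptic segmentation literature: predicted and ground-truth segments are matched when their IoU exceeds 0.5, and PQ is the sum of matched IoUs divided by |TP| + 0.5|FP| + 0.5|FN|. It illustrates the original metric only, not the alternative proposed in this paper. The function names and the representation of segments as sets of pixel indices are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two segments given as sets of pixel indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def panoptic_quality(pred_segments, gt_segments, iou_threshold=0.5):
    """Standard PQ for one class (illustrative sketch, not the paper's variant).

    Assumes segments within each list do not overlap, as in a panoptic
    output, so a match with IoU > 0.5 is unique.
    """
    matched_pred = set()
    iou_sum = 0.0
    true_positives = 0
    for gt in gt_segments:
        for idx, pred in enumerate(pred_segments):
            if idx in matched_pred:
                continue
            value = iou(pred, gt)
            if value > iou_threshold:
                matched_pred.add(idx)
                iou_sum += value
                true_positives += 1
                break
    false_positives = len(pred_segments) - true_positives
    false_negatives = len(gt_segments) - true_positives
    denom = true_positives + 0.5 * false_positives + 0.5 * false_negatives
    return iou_sum / denom if denom else 0.0


if __name__ == "__main__":
    gt = [{0, 1, 2, 3}, {10, 11}]
    pred = [{0, 1, 2}, {20, 21}]
    # One match with IoU 0.75, one unmatched prediction, one unmatched ground truth.
    print(panoptic_quality(pred, gt))
```

Under this definition, a non-instance ("stuff") category contributes at most one segment per image, so a prediction whose IoU falls just below the threshold scores zero for that class in that image. This is one plausible reading of the limitation mentioned in the abstract.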

Related Material

author = {Porzi, Lorenzo and Bulo, Samuel Rota and Colovic, Aleksander and Kontschieder, Peter},
title = {Seamless Scene Segmentation},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2019}