SkyScapes Fine-Grained Semantic Understanding of Aerial Scenes

Seyed Majid Azimi, Corentin Henry, Lars Sommer, Arne Schumann, Eleonora Vig; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 7393-7403


Understanding complex urban infrastructure with centimeter-level accuracy is essential for many applications, from autonomous driving to mapping, infrastructure monitoring, and urban management. Aerial images provide valuable information over a large area instantaneously; nevertheless, no current dataset captures the complexity of aerial scenes at the level of granularity required by real-world applications. To address this, we introduce SkyScapes, an aerial image dataset with highly accurate, fine-grained annotations for pixel-level semantic labeling. SkyScapes provides annotations for 31 semantic categories ranging from large structures, such as buildings, roads, and vegetation, to fine details, such as 12 (sub-)categories of lane markings. We have defined two main tasks on this dataset: dense semantic segmentation and multi-class lane-marking prediction. We carry out extensive experiments to evaluate state-of-the-art segmentation methods on SkyScapes. Existing methods struggle to deal with the wide range of classes, object sizes, scales, and fine details present. We therefore propose a novel multi-task model, which incorporates semantic edge detection and is better tuned for feature extraction across a wide range of scales. This model achieves notable improvements over the baselines in region outlines and level of detail on both tasks.
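Dense semantic segmentation benchmarks of this kind are typically scored with per-class intersection-over-union (IoU) and its mean over classes (mIoU). As an illustrative sketch only (the exact evaluation protocol used for SkyScapes is not specified here), per-class IoU can be computed from predicted and ground-truth label maps like this; the function name and toy labels are hypothetical:

```python
import numpy as np

def per_class_iou(pred, gt, num_classes):
    """Per-class intersection-over-union between two integer label maps.

    Classes absent from both maps get NaN so they can be
    excluded from the mean with np.nanmean.
    """
    ious = []
    for c in range(num_classes):
        p = (pred == c)
        g = (gt == c)
        inter = np.logical_and(p, g).sum()
        union = np.logical_or(p, g).sum()
        ious.append(inter / union if union > 0 else float("nan"))
    return ious

# Toy 2x3 label maps with 3 classes (hypothetical example data)
pred = np.array([[0, 1, 2],
                 [0, 1, 1]])
gt   = np.array([[0, 1, 2],
                 [0, 2, 1]])

ious = per_class_iou(pred, gt, num_classes=3)
miou = np.nanmean(ious)   # mean IoU over classes present
```

In this toy example class 0 is predicted perfectly (IoU 1.0), while classes 1 and 2 are partially confused with each other, which pulls the mean down; fine-grained classes such as individual lane-marking types tend to dominate this penalty because their pixel regions are small.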

Related Material

@InProceedings{Azimi_2019_ICCV,
    author    = {Azimi, Seyed Majid and Henry, Corentin and Sommer, Lars and Schumann, Arne and Vig, Eleonora},
    title     = {SkyScapes Fine-Grained Semantic Understanding of Aerial Scenes},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2019}
}