ProcSy: Procedural Synthetic Dataset Generation Towards Influence Factor Studies Of Semantic Segmentation Networks

Samin Khan, Buu Phan, Rick Salay, Krzysztof Czarnecki; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2019, pp. 88-96

Abstract


Real-world, large-scale semantic segmentation datasets are expensive and time-consuming to create. Thus, the research community has explored the use of video game worlds and simulator environments to produce large-scale synthetic datasets, mainly to supplement the real-world ones for training deep neural networks. Another use of synthetic datasets is to enable highly controlled and repeatable experiments, thanks to the ability to manipulate the content and rendering of synthesized imagery. To this end, we outline a method to generate an arbitrarily large semantic segmentation dataset reflecting real-world features, while minimizing required cost and man-hours. We demonstrate its use by generating ProcSy, a synthetic dataset for semantic segmentation, which is modeled on a real-world urban environment and features a range of variable influence factors, such as weather and lighting. Our experiments investigate the impact of these factors on the performance of a state-of-the-art deep network. Among other findings, we show that including as little as 3% rainy images in the training set improves the mIoU of the network on rainy images by about 10%, while training with more than 15% rainy images yields diminishing returns. We provide the ProcSy dataset, along with generated 3D assets and code, as supplementary material.

Related Material


[bibtex]
@InProceedings{Khan_2019_CVPR_Workshops,
author = {Khan, Samin and Phan, Buu and Salay, Rick and Czarnecki, Krzysztof},
title = {ProcSy: Procedural Synthetic Dataset Generation Towards Influence Factor Studies Of Semantic Segmentation Networks},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2019}
}