Urban Scene Semantic Segmentation With Low-Cost Coarse Annotation

Anurag Das, Yongqin Xian, Yang He, Zeynep Akata, Bernt Schiele; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023, pp. 5978-5987


For best performance, today's semantic segmentation methods use large and carefully labeled datasets, requiring expensive annotation budgets. In this work, we show that coarse annotation is a low-cost but highly effective alternative for training semantic segmentation models. Considering the urban scene segmentation scenario, we leverage cheap coarse annotations for real-world captured data, as well as synthetic data to train our model and show competitive performance compared with fully annotated real-world data. Specifically, we propose a coarse-to fine self-training framework that generates pseudo labels for unlabeled regions of the coarsely annotated data, using synthetic data to improve predictions around the boundaries between semantic classes, and using cross-domain data augmentation to increase diversity. Our extensive experimental results on Cityscapes and BDD100k datasets demonstrate that our method achieves a significantly better performance vs annotation cost tradeoff, yielding a comparable performance to fully annotated data with only a small fraction of the annotation budget. Also, when used as pretraining, our framework performs better compared to the standard fully supervised setting.

Related Material

[pdf] [arXiv]
@InProceedings{Das_2023_WACV, author = {Das, Anurag and Xian, Yongqin and He, Yang and Akata, Zeynep and Schiele, Bernt}, title = {Urban Scene Semantic Segmentation With Low-Cost Coarse Annotation}, booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)}, month = {January}, year = {2023}, pages = {5978-5987} }