Modernized Training of U-Net for Aerial Semantic Segmentation

Jakub Straka, Ivan Gruber; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops, 2024, pp. 776-784


In this paper, we propose an improved training protocol for the U-Net architecture for semantic segmentation of aerial images. We test our approach on the challenging FLAIR #2 dataset. We present an extensive ablation study of how the individual components of the approach influence overall performance. The ablation study compares different model backbones, image augmentations, learning rate schedulers, loss functions, and training procedures. We additionally propose a two-stage training procedure and evaluate different options for model ensembling. Based on the results, we design the final training protocol. This final setup decreases the relative error by approximately 18% and achieves an mIoU of 0.641, which is a new state-of-the-art result. Our code is available at:
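
For readers unfamiliar with the mIoU metric quoted in the abstract, the following is a minimal illustrative sketch of how mean intersection-over-union is commonly computed from a confusion matrix. It is not the authors' released code. The function name `mean_iou`, the toy label arrays, and the choice to skip classes that are absent from both prediction and ground truth are assumptions made for this example; the exact evaluation protocol used for FLAIR #2 is not described here.

```python
import numpy as np


def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes (hypothetical helper)."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # Confusion matrix: rows index ground-truth classes, columns predictions.
    cm = np.bincount(
        target * num_classes + pred, minlength=num_classes**2
    ).reshape(num_classes, num_classes)
    intersection = np.diag(cm)
    # Union = predicted pixels + ground-truth pixels - overlap, per class.
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection
    # Assumption: classes absent from both maps are left out of the mean.
    valid = union > 0
    return float((intersection[valid] / union[valid]).mean())


# Toy 2x3 label maps with three classes (made-up data for illustration).
pred = np.array([[0, 0, 1], [1, 2, 2]])
target = np.array([[0, 1, 1], [1, 2, 0]])
score = mean_iou(pred, target, num_classes=3)
print(score)
```

In this toy case the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is 0.5. Real evaluations typically accumulate the confusion matrix over the whole test set before taking the per-class ratio.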

Related Material

@InProceedings{Straka_2024_WACV,
    author    = {Straka, Jakub and Gruber, Ivan},
    title     = {Modernized Training of U-Net for Aerial Semantic Segmentation},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops},
    month     = {January},
    year      = {2024},
    pages     = {776-784}
}