Ladder-Style DenseNets for Semantic Segmentation of Large Natural Images

Ivan Kreso, Sinisa Segvic, Josip Krapac; Proceedings of the IEEE International Conference on Computer Vision (ICCV) Workshops, 2017, pp. 238-245

Abstract


Recent progress in deep image classification models offers great potential to improve state-of-the-art performance in related computer vision tasks. However, the transition to semantic segmentation is not straightforward due to the strict memory limitations of contemporary GPU cards. The extent of feature map caching required by convolutional backprop poses significant challenges even for moderately sized PASCAL images, and demands careful architectural considerations when the source resolution is in the megapixel range. To address these concerns, we propose a DenseNet-based ladder-style architecture which features a lean representation near the original resolution. The resulting fully convolutional models have few parameters, allow training at megapixel resolution on commodity hardware, and display fair semantic segmentation performance even without ImageNet pre-training. We present experiments on the Cityscapes and PASCAL VOC 2012 datasets and report competitive results.
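The memory argument in the abstract can be made concrete with a rough back-of-the-envelope calculation. The sketch below estimates how many bytes a single cached feature map occupies at batch size 1, assuming 32-bit floats. The helper `activation_bytes` and the channel counts (64 and 128) are hypothetical values chosen for illustration only; they are not taken from the paper. The 1024x2048 input matches the Cityscapes image size.

```python
def activation_bytes(height, width, channels, stride=1, bytes_per_value=4):
    """Bytes needed to cache one feature map for backprop (batch size 1).

    `stride` is the total subsampling factor relative to the input image.
    """
    h = height // stride
    w = width // stride
    return h * w * channels * bytes_per_value


MIB = 1024 ** 2

# Hypothetical layer widths, chosen only for illustration.
# A 64-channel map kept at the full 1024x2048 input resolution:
full_res = activation_bytes(1024, 2048, channels=64, stride=1)
# A 128-channel map at 1/4 of the input resolution:
quarter_res = activation_bytes(1024, 2048, channels=128, stride=4)

print(full_res / MIB)     # 512.0
print(quarter_res / MIB)  # 64.0
```

Under these assumptions, one feature map at full resolution costs eight times as much memory as a map with twice the channels at quarter resolution, which suggests why keeping the representation lean near the original resolution matters for megapixel training.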

Related Material


[bibtex]
@InProceedings{Kreso_2017_ICCV,
author = {Kreso, Ivan and Segvic, Sinisa and Krapac, Josip},
title = {Ladder-Style DenseNets for Semantic Segmentation of Large Natural Images},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV) Workshops},
month = {Oct},
year = {2017}
}