DFANet: Deep Feature Aggregation for Real-Time Semantic Segmentation

Hanchao Li, Pengfei Xiong, Haoqiang Fan, Jian Sun; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 9522-9531

Abstract


This paper introduces an extremely efficient CNN architecture named DFANet for semantic segmentation under resource constraints. Our proposed network starts from a single lightweight backbone and aggregates discriminative features through sub-network and sub-stage cascades, respectively. Based on this multi-scale feature propagation, DFANet substantially reduces the number of parameters while still obtaining a sufficient receptive field and enhancing the model's learning ability, striking a balance between speed and segmentation performance. Experiments on the Cityscapes and CamVid datasets demonstrate the superior performance of DFANet, with 8× fewer FLOPs and 2× faster inference than existing state-of-the-art real-time semantic segmentation methods while providing comparable accuracy. Specifically, it achieves 70.3% mean IoU on the Cityscapes test set with only 1.7 GFLOPs and a speed of 160 FPS on one NVIDIA Titan X card, and 71.3% mean IoU with 3.4 GFLOPs when inferring on a higher-resolution image.
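The accuracy figures above are reported as mean IoU (intersection over union averaged over classes). As a point of reference, a minimal sketch of the standard definition is given below; it is not taken from the paper or from any benchmark's evaluation code. The function name `mean_iou` and the confusion-matrix layout (rows are ground-truth classes, columns are predicted classes) are assumptions made for illustration.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix.

    Assumed layout (illustrative): confusion[g][p] counts pixels whose
    ground-truth class is g and whose predicted class is p.
    """
    num_classes = len(confusion)
    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]                            # correctly labeled pixels of class c
        fn = sum(confusion[c]) - tp                     # class-c pixels predicted as something else
        fp = sum(row[c] for row in confusion) - tp      # other pixels predicted as class c
        denom = tp + fp + fn
        if denom > 0:                                   # skip classes absent from both GT and prediction
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy two-class example: per-class IoU is 3/5 and 5/7.
toy_confusion = [[3, 1],
                 [1, 5]]
toy_score = mean_iou(toy_confusion)
```

In this sketch the per-class IoU is TP / (TP + FP + FN), so a class is penalized both for pixels it misses and for pixels it wrongly claims. This is why mean IoU is stricter than plain pixel accuracy on datasets with imbalanced classes.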

Related Material


[bibtex]
@InProceedings{Li_2019_CVPR,
author = {Li, Hanchao and Xiong, Pengfei and Fan, Haoqiang and Sun, Jian},
title = {DFANet: Deep Feature Aggregation for Real-Time Semantic Segmentation},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2019}
}