Bidirectional Pyramid Networks for Semantic Segmentation

Dong Nie, Jia Xue, Xiaofeng Ren; Proceedings of the Asian Conference on Computer Vision (ACCV), 2020

Abstract


Semantic segmentation is a fundamental problem in computer vision that has attracted a lot of attention. Recent efforts have been devoted to network architecture innovations for efficient semantic segmentation that can run in real-time for autonomous driving and other applications. Information flow between scales is crucial because accurate segmentation needs both large context and fine detail. However, most existing approaches still rely on pretrained backbone models (e.g. ResNet on ImageNet). In this work, we propose to open up the backbone and design a simple yet effective multiscale network architecture, Bidirectional Pyramid Network (BPNet). BPNet takes the shape of a pyramid: information flows from bottom (high-resolution, small receptive field) to top (low-resolution, large receptive field), and from top to bottom, in a systematic manner, at every step of the processing. More importantly, fusion needs to be efficient; this is done through an add-and-multiply module with learned weights. We also apply a unary-pairwise attention mechanism to balance position sensitivity and context aggregation. Auxiliary loss is applied at multiple steps of the pyramid bottom. The resulting network achieves high accuracy with efficiency, without the need of pretraining. On the standard Cityscapes dataset, we achieve test mIoU 76.3 with 5.1M parameters and 36 fps (on Nvidia 2080 Ti), competitive with the state-of-the-art real-time models. Meanwhile, our design is general and can be used to build heavier networks: a ResNet-101 equivalent version of BPNet achieves mIoU 81.9 on Cityscapes, competitive with the best published results. We further demonstrate the flexibility of BPNet on a prostate MRI segmentation task, achieving the state of the art with a 45x speed-up.
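The abstract does not spell out the add-and-multiply module, so the following is only a minimal sketch of one plausible reading: a coarse (low-resolution) feature is upsampled to the fine resolution, and the two are combined as a weighted sum of their element-wise sum and element-wise product. The function names, the 1-D list representation, the nearest-neighbour upsampling, and the two scalar weights `w_add` and `w_mul` (which would be learned in a real network) are all assumptions made for illustration and are not taken from the paper.

```python
def upsample_nearest(coarse, factor):
    """Repeat each coarse value `factor` times (nearest-neighbour upsampling)."""
    return [value for value in coarse for _ in range(factor)]


def add_multiply_fuse(fine, coarse, w_add, w_mul):
    """Hypothetical add-and-multiply fusion of two 1-D feature rows.

    Computes w_add * (f + c) + w_mul * (f * c) element-wise, where c is the
    coarse feature upsampled to the length of the fine feature. In a trained
    network w_add and w_mul would be learned; here they are plain floats.
    """
    if not coarse or len(fine) % len(coarse) != 0:
        raise ValueError("fine length must be a positive multiple of coarse length")
    factor = len(fine) // len(coarse)
    upsampled = upsample_nearest(coarse, factor)
    return [w_add * (f + c) + w_mul * (f * c) for f, c in zip(fine, upsampled)]


# Toy example: a 4-element fine feature fused with a 2-element coarse feature.
fine = [1.0, 2.0, 3.0, 4.0]
coarse = [10.0, 20.0]
fused = add_multiply_fuse(fine, coarse, w_add=0.5, w_mul=0.25)
print(fused)  # [8.0, 11.0, 26.5, 32.0]
```

This sketch covers only the top-to-bottom direction; under the same assumptions, the bottom-to-top direction would downsample the fine feature instead of upsampling the coarse one. The additive term passes both signals through, while the multiplicative term lets one scale gate the other, at the cost of only a few element-wise operations.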

Related Material


@InProceedings{Nie_2020_ACCV,
    author    = {Nie, Dong and Xue, Jia and Ren, Xiaofeng},
    title     = {Bidirectional Pyramid Networks for Semantic Segmentation},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {November},
    year      = {2020}
}