Learning More Accurate Features for Semantic Segmentation in CycleNet

Linzi Qu, Lihuo He, Junjie Ke, Xinbo Gao, Wen Lu; Proceedings of the Asian Conference on Computer Vision (ACCV), 2020


Contextual information is essential for computer vision tasks, especially semantic segmentation. Previous works, such as PSPNet and DenseASPP, generally focus on collecting contextual information by enlarging the receptive field. In contrast, this paper proposes a new network, CycleNet, which aims to assign a more accurate representation to every pixel. It consists of two modules: Cycle Atrous Spatial Pyramid Pooling (CycleASPP) and Alignment with Deformable Convolution (ADC). CycleASPP densely connects a series of atrous convolution layers with different dilation rates. The forward connections aggregate more contextual information, while the backward connections transfer high-level features to low-level layers so that the network pays more attention to important information. ADC, in turn, produces more accurate features during decoding. It uses deformable convolution to select and recombine features from different blocks, thereby alleviating the misalignment caused by simple interpolation. Experiments on Cityscapes and ADE20K demonstrate the effectiveness of CycleNet. In particular, our model achieves 46.14% mIoU on the ADE20K validation set.
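To make the receptive-field argument concrete, the sketch below computes how far a stack of stride-1 atrous convolution layers can see along one axis. It relies only on the standard relation that a kernel of size k with dilation d spans k + (k - 1)(d - 1) positions. The function names and the dilation rates used here are illustrative assumptions and are not taken from the paper; the code does not model CycleASPP's forward or backward connections.

```python
def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Span covered by one atrous (dilated) kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(dilations, kernel_size: int = 3) -> int:
    """Receptive field of stride-1 atrous layers applied one after another."""
    rf = 1  # a single pixel sees only itself before any convolution
    for d in dilations:
        # each stride-1 layer widens the field by (effective kernel - 1)
        rf += effective_kernel(kernel_size, d) - 1
    return rf


# Illustrative dilation rates (an assumption, not the paper's configuration).
rates = (3, 6, 12, 18, 24)
print([effective_kernel(3, d) for d in rates])  # [7, 13, 25, 37, 49]
print(stacked_receptive_field(rates))           # 127
```

Under these assumed rates, chaining the layers yields a 127-pixel receptive field per axis, compared with 49 pixels for the largest single layer, which is why cascaded or densely connected atrous layers are a common way to gather context.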

Related Material

@InProceedings{Qu_2020_ACCV,
    author    = {Qu, Linzi and He, Lihuo and Ke, Junjie and Gao, Xinbo and Lu, Wen},
    title     = {Learning More Accurate Features for Semantic Segmentation in CycleNet},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {November},
    year      = {2020}
}