IMP: Instance Mask Projection for High Accuracy Semantic Segmentation of Things

Cheng-Yang Fu, Tamara L. Berg, Alexander C. Berg; The IEEE International Conference on Computer Vision (ICCV), 2019, pp. 5178-5187

Abstract


In this work, we present a new operator, called Instance Mask Projection (IMP), which projects a predicted instance segmentation as a new feature for semantic segmentation. It also supports backpropagation and is trainable end-to-end. By adding this operator, we introduce a new way to combine top-down and bottom-up information in semantic segmentation. Our experiments show the effectiveness of IMP on both clothing parsing (with complex layering, large deformations, and non-convex objects) and street scene segmentation (with many overlapping instances and small objects). On the Varied Clothing Parsing dataset (VCP), we show that instance mask projection can improve mIOU by 3 points over a state-of-the-art Panoptic FPN segmentation approach. On the ModaNet clothing parsing dataset, we show a dramatic improvement of 20.4% compared to existing baseline semantic segmentation results. In addition, the Instance Mask Projection operator works well on other (non-clothing) datasets, providing an improvement in mIOU of 3 points on "thing" classes of Cityscapes, a self-driving dataset, over a state-of-the-art approach.
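As a rough illustration of what "projecting a predicted instance segmentation as a new feature" could look like, the following is a minimal NumPy sketch, not the authors' implementation. It assumes each detected instance comes with a box, a class label, a confidence score, and a mask probability map already resized to the box; it pastes the score-weighted mask into the channel of its class and resolves overlaps with an element-wise maximum. The function name `project_instance_masks`, its arguments, the score weighting, and the max-based overlap rule are all assumptions made for this example and are not stated in the abstract.

```python
import numpy as np


def project_instance_masks(masks, boxes, labels, scores,
                           num_classes, height, width):
    """Paste per-instance mask probabilities onto a per-class canvas.

    Hypothetical helper for illustration only.

    masks:  list of 2-D float arrays in [0, 1], one per instance, each
            already resized to its box size (y1 - y0, x1 - x0)
    boxes:  list of (x0, y0, x1, y1) integer pixel boxes, x1/y1 exclusive
    labels: list of integer class ids in [0, num_classes)
    scores: list of detection confidences in [0, 1]
    """
    canvas = np.zeros((num_classes, height, width), dtype=np.float32)
    for mask, (x0, y0, x1, y1), label, score in zip(masks, boxes, labels, scores):
        # Clip the box to the image so partially visible instances still paste.
        cx0, cy0 = max(x0, 0), max(y0, 0)
        cx1, cy1 = min(x1, width), min(y1, height)
        if cx0 >= cx1 or cy0 >= cy1:
            continue  # box lies entirely outside the image
        # Take the part of the mask that falls inside the image and
        # weight it by the detection score (an assumption of this sketch).
        patch = np.asarray(mask)[cy0 - y0:cy1 - y0, cx0 - x0:cx1 - x0] * score
        region = canvas[label, cy0:cy1, cx0:cx1]
        # Assumed overlap rule: keep the strongest response per pixel.
        canvas[label, cy0:cy1, cx0:cx1] = np.maximum(region, patch)
    return canvas
```

The resulting `(num_classes, height, width)` array could then be concatenated with the semantic segmentation branch's feature map along the channel axis. A trainable version would perform the same paste inside an automatic differentiation framework so that gradients can flow back to the mask predictions, in line with the abstract's statement that the operator supports backpropagation.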

Related Material


@InProceedings{Fu_2019_ICCV,
author = {Fu, Cheng-Yang and Berg, Tamara L. and Berg, Alexander C.},
title = {IMP: Instance Mask Projection for High Accuracy Semantic Segmentation of Things},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019}
}