Scene Parsing with Object Instances and Occlusion Ordering

Joseph Tighe, Marc Niethammer, Svetlana Lazebnik; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, pp. 3748-3755

Abstract


This work proposes a method to interpret a scene by assigning a semantic label to every pixel and inferring the spatial extent of individual object instances together with their occlusion relationships. Starting with an initial pixel labeling and a set of candidate object masks for a given test image, we select a subset of objects that explain the image well and have valid overlap relationships and occlusion ordering. This is done by solving an integer quadratic program, using either a greedy method or a standard solver. Then we alternate between using the object predictions to refine the pixel labels and vice versa. The proposed system obtains promising results on two challenging subsets of the LabelMe and SUN datasets, the larger of which contains 45,676 images and 232 classes.
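To make the selection step concrete, the following is a minimal sketch of greedy minimization of a binary quadratic objective. It is not the formulation from the paper: the energy form, the names `greedy_binary_qp`, `energy`, `unary`, and `pairwise`, and the toy numbers are all assumptions chosen for illustration. Each candidate mask i has a hypothetical cost `unary[i]` (negative when the mask explains the image well), and each pair has a hypothetical penalty `pairwise[i, j]` for conflicting overlap.

```python
import numpy as np

def energy(x, unary, pairwise):
    """Illustrative objective E(x) = unary . x + x^T pairwise x for binary x."""
    x = np.asarray(x, dtype=float)
    return float(unary @ x + x @ pairwise @ x)

def greedy_binary_qp(unary, pairwise):
    """Greedily pick candidates while adding one still lowers the energy.

    Assumes `pairwise` is symmetric with a zero diagonal, so the change
    from adding candidate i is unary[i] + 2 * sum_j pairwise[i, j] over
    the already selected j.
    """
    unary = np.asarray(unary, dtype=float)
    pairwise = np.asarray(pairwise, dtype=float)
    selected = np.zeros(len(unary), dtype=bool)
    while True:
        # Energy change for adding each candidate to the current selection.
        delta = unary + 2.0 * pairwise[:, selected].sum(axis=1)
        delta[selected] = np.inf  # never re-add a selected candidate
        best = int(np.argmin(delta))
        if delta[best] >= 0:  # no remaining candidate lowers the energy
            break
        selected[best] = True
    return selected

# Toy data (hypothetical): candidates 0 and 1 conflict with each other.
unary = np.array([-3.0, -2.0, -1.0])
pairwise = np.array([[0.0, 2.0, 0.0],
                     [2.0, 0.0, 0.0],
                     [0.0, 0.0, 0.0]])
selected = greedy_binary_qp(unary, pairwise)
final_energy = energy(selected, unary, pairwise)
```

In this toy case the greedy pass keeps candidates 0 and 2 and rejects candidate 1, whose conflict penalty outweighs its benefit. A greedy pass of this kind is fast but gives no optimality guarantee, which is one reason an exact solver might be used instead.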

Related Material


[bibtex]
@InProceedings{Tighe_2014_CVPR,
author = {Tighe, Joseph and Niethammer, Marc and Lazebnik, Svetlana},
title = {Scene Parsing with Object Instances and Occlusion Ordering},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2014},
pages = {3748--3755}
}