From Subcategories to Visual Composites: A Multi-level Framework for Object Detection

Tian Lan, Michalis Raptis, Leonid Sigal, Greg Mori; Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2013, pp. 369-376

Abstract


The appearance of an object changes profoundly with pose, camera view, and interactions of the object with other objects in the scene. This makes it challenging to learn detectors based on an object-level label (e.g., "car"). We postulate that having a richer set of labelings (at different levels of granularity) for an object can significantly alleviate these problems. Such labelings include finer-grained subcategories, consistent in appearance and view, and higher-order composites (contextual groupings of objects consistent in their spatial layout and appearance). However, obtaining such a rich set of annotations, including annotation of an exponentially growing set of object groupings, is simply not feasible. We propose a weakly-supervised framework for object detection where we discover subcategories and composites automatically with only traditional object-level category labels as input. To this end, we first propose an exemplar-SVM-based clustering approach, with latent SVM refinement, that discovers a variable-length set of discriminative subcategories for each object class. We then develop a structured model for object detection that captures interactions among object subcategories and automatically discovers semantically meaningful and discriminatively relevant visual composites. We show that this model produces state-of-the-art performance on the UIUC phrase object detection benchmark.

Related Material


[bibtex]
@InProceedings{Lan_2013_ICCV,
author = {Lan, Tian and Raptis, Michalis and Sigal, Leonid and Mori, Greg},
title = {From Subcategories to Visual Composites: A Multi-level Framework for Object Detection},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
month = {December},
year = {2013},
pages = {369-376}
}