Predicting an Object Location Using a Global Image Representation

Jose A. Rodriguez Serrano, Diane Larlus; The IEEE International Conference on Computer Vision (ICCV), 2013, pp. 1729-1736


We tackle the detection of prominent objects in images as a retrieval task: given a global image descriptor, we find the most similar images in an annotated dataset and transfer their object bounding boxes. We refer to this approach as data-driven detection (DDD), an alternative to sliding windows. Previous works have used similar notions but with task-independent similarities and representations, i.e., they were not tailored to the end goal of localization. This article proposes two contributions: (i) a metric learning algorithm and (ii) a representation of images as object probability maps, both of which are optimized for detection. We show experimentally that these two contributions are crucial to DDD, do not require costly additional operations, and in some cases yield results comparable to or better than those of state-of-the-art detectors, despite conceptual simplicity and increased speed. As an application of prominent object detection, we improve fine-grained categorization by pre-cropping images with the proposed approach.
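The retrieval-and-transfer idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `predict_box`, the plain dot-product similarity, and the averaging of the top-k boxes are all assumptions made for illustration. In the paper, the similarity is learned and the descriptor may be an object probability map; neither is reproduced here.

```python
from typing import List, Sequence, Tuple

# A box is (x_min, y_min, x_max, y_max), assumed normalized to [0, 1].
Box = Tuple[float, float, float, float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product, used here as a stand-in for a learned similarity."""
    return sum(x * y for x, y in zip(a, b))


def predict_box(
    query: Sequence[float],
    database: List[Tuple[Sequence[float], Box]],
    k: int = 1,
) -> Box:
    """Predict a box for `query` by transferring boxes from its k most
    similar annotated images (hypothetical helper, not from the paper).

    `database` holds (global descriptor, annotated box) pairs.
    With k > 1 the retrieved boxes are averaged coordinate-wise, which
    is one simple way to combine them and is an assumption of this sketch.
    """
    if not database:
        raise ValueError("database must contain at least one annotated image")
    if k < 1:
        raise ValueError("k must be at least 1")

    # Rank annotated images by similarity to the query descriptor.
    ranked = sorted(database, key=lambda item: dot(query, item[0]), reverse=True)
    top = ranked[:k]

    # Transfer: average each of the four coordinates over the retrieved boxes.
    n = len(top)
    return tuple(sum(box[i] for _, box in top) / n for i in range(4))
```

With `k=1` the function simply copies the box of the single nearest neighbor. Because no window is scored at test time, the cost is dominated by one descriptor comparison per annotated image, which is consistent with the speed advantage the abstract claims over sliding windows.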

Related Material

author = {Rodriguez Serrano, Jose A. and Larlus, Diane},
title = {Predicting an Object Location Using a Global Image Representation},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {December},
year = {2013}