Localizing Grouped Instances for Efficient Detection in Low-Resource Scenarios

Amelie Royer, Christoph Lampert; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2020, pp. 1727-1736

Abstract


State-of-the-art detection systems generally focus on, and are evaluated on, their ability to exhaustively retrieve objects densely distributed in the image, across a wide variety of appearances and semantic categories. Orthogonal to this, many practical object detection applications, for example in remote sensing, instead require dealing with large images that contain only a few small objects of a single class, scattered heterogeneously across the space. In addition, they are often subject to strict computational constraints, such as limited battery capacity and computing power. To tackle these more practical scenarios, we propose a novel detection scheme that offers a flexible and efficient framework for detection tasks with variable object sizes and densities: We rely on a sequence of detection stages, each of which has the ability to predict groups of objects as well as individuals. Similar to a detection cascade, this multi-stage architecture spares computational effort by discarding large irrelevant regions of the image early during the detection process. The ability to group objects provides further computational and memory savings, as it allows working with lower image resolutions in early stages, where groups are more easily detected than individuals. We report experimental results on two aerial image datasets, and show that the proposed method is as accurate as, yet computationally more efficient than, standard single-shot detectors, consistently across three different backbone architectures.
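
The control flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Prediction` fields, the `Stage` interface, the `min_score` threshold, and the toy stages are all hypothetical, and real stages would be neural detectors run on image crops at increasing resolution. The sketch only shows how low-confidence regions might be dropped early while confident group boxes are passed on for refinement.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in full-image coordinates


@dataclass
class Prediction:
    box: Box
    score: float    # confidence that the box contains at least one object
    is_group: bool  # True if the box is predicted to hold several objects


# Hypothetical stage interface: maps an image region to predictions inside it.
Stage = Callable[[Box], List[Prediction]]


def run_cascade(image_box: Box, stages: List[Stage],
                min_score: float = 0.5) -> List[Prediction]:
    """Run stages in order, refining only regions flagged as confident groups."""
    final: List[Prediction] = []
    regions = [image_box]
    for depth, stage in enumerate(stages):
        is_last = depth == len(stages) - 1
        next_regions: List[Box] = []
        for region in regions:
            for pred in stage(region):
                if pred.score < min_score:
                    continue  # discard low-confidence regions early
                if pred.is_group and not is_last:
                    next_regions.append(pred.box)  # refine in the next stage
                else:
                    # Individuals are final; groups at the last stage are
                    # kept as-is (an assumption of this sketch).
                    final.append(pred)
        regions = next_regions
    return final


# Toy stand-ins for learned detectors (assumptions, for illustration only).
def coarse_stage(region: Box) -> List[Prediction]:
    return [
        Prediction((0, 0, 100, 100), 0.9, True),       # confident group
        Prediction((400, 400, 420, 420), 0.8, False),  # confident individual
        Prediction((700, 700, 900, 900), 0.1, True),   # low score, discarded
    ]


def fine_stage(region: Box) -> List[Prediction]:
    x0, y0, x1, y1 = region
    mid = (x0 + x1) / 2
    return [
        Prediction((x0, y0, mid, y1), 0.7, False),
        Prediction((mid, y0, x1, y1), 0.6, False),
    ]


detections = run_cascade((0, 0, 1000, 1000), [coarse_stage, fine_stage])
```

In this toy run the second stage is invoked on a single group region only, which is where the computational saving of such a scheme would come from.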

Related Material


[bibtex]
@InProceedings{Royer_2020_WACV,
author = {Royer, Amelie and Lampert, Christoph},
title = {Localizing Grouped Instances for Efficient Detection in Low-Resource Scenarios},
booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
month = {March},
year = {2020},
pages = {1727--1736}
}