DeNet: Scalable Real-Time Object Detection With Directed Sparse Sampling

Lachlan Tychsen-Smith, Lars Petersson; The IEEE International Conference on Computer Vision (ICCV), 2017, pp. 428-436

Abstract


We define the object detection from imagery problem as estimating a very large but extremely sparse bounding-box-dependent probability distribution. Subsequently, we identify a sparse distribution estimation scheme, Directed Sparse Sampling, and employ it in a single end-to-end CNN-based detection model. This methodology extends and formalizes previous state-of-the-art detection models with an additional emphasis on high evaluation rates and reduced manual engineering. We introduce two novelties: a corner-based region-of-interest estimator and a deconvolution-based CNN model. The resulting model is scene-adaptive, does not require manually defined reference bounding boxes, and produces highly competitive results on MSCOCO, Pascal VOC 2007 and Pascal VOC 2012 with real-time evaluation rates. Further analysis suggests our model performs particularly well when fine-grained object localization is desirable. We argue that this advantage stems from the significantly larger set of available regions-of-interest relative to other methods. Source code is available from: https://github.com/lachlants/denet

Related Material


@InProceedings{Tychsen-Smith_2017_ICCV,
author = {Tychsen-Smith, Lachlan and Petersson, Lars},
title = {DeNet: Scalable Real-Time Object Detection With Directed Sparse Sampling},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {Oct},
year = {2017}
}