Box Aggregation for Proposal Decimation: Last Mile of Object Detection

Shu Liu, Cewu Lu, Jiaya Jia; Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2015, pp. 2569-2577

Abstract


Regions-with-convolutional-neural-network (RCNN) is now a commonly employed object detection pipeline. Its main steps, namely proposal generation and convolutional neural network (CNN) feature extraction, have been intensively investigated. We focus on the last step of the system, which aggregates thousands of scored box proposals into the final object predictions; we call this step proposal decimation. We show that this step can be enhanced with a very simple box aggregation function that takes into account the statistical properties of proposals with respect to ground-truth objects. Our method is computationally very lightweight, yet it yields an improvement of 3.7% in mAP on the PASCAL VOC 2007 test set. We explain why the method works using statistics of the proposals.
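
To make the idea of aggregating scored proposals concrete, the following is a minimal sketch of one possible aggregation scheme. It is an illustration only and not necessarily the function used in the paper: the abstract does not specify the aggregation function, so the score-weighted coordinate average, the grouping by overlap with the top-scoring box, and the names `iou`, `aggregate_boxes`, and `iou_thresh` are all assumptions. Boxes are assumed to be `(x1, y1, x2, y2)` tuples with strictly positive scores.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def aggregate_boxes(boxes, scores, iou_thresh=0.5):
    """Hypothetical aggregation: merge each group of overlapping proposals.

    Assumption (not taken from the paper): a group is the highest-scoring
    remaining box plus every remaining box whose IoU with it is at least
    iou_thresh, and the merged box is the score-weighted mean of the group.
    Scores are assumed to be strictly positive.
    """
    # Indices of proposals, highest score first.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    results = []
    while order:
        seed = order[0]
        group = [i for i in order
                 if i == seed or iou(boxes[seed], boxes[i]) >= iou_thresh]
        total = sum(scores[i] for i in group)
        # Score-weighted average of each of the four coordinates.
        merged = tuple(
            sum(scores[i] * boxes[i][k] for i in group) / total
            for k in range(4)
        )
        results.append((merged, scores[seed]))
        # Drop the whole group before forming the next one.
        order = [i for i in order if i not in group]
    return results


if __name__ == "__main__":
    proposals = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    proposal_scores = [0.9, 0.6, 0.8]
    for box, score in aggregate_boxes(proposals, proposal_scores):
        print(box, score)
```

Unlike plain non-maximum suppression, which keeps only the top-scoring box of each group and discards the rest, a scheme of this kind lets lower-scoring proposals still contribute to the final box location.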

Related Material


[bibtex]
@InProceedings{Liu_2015_ICCV,
author = {Liu, Shu and Lu, Cewu and Jia, Jiaya},
title = {Box Aggregation for Proposal Decimation: Last Mile of Object Detection},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
month = {December},
year = {2015}
}