Efficient Object Detection in Large Images Using Deep Reinforcement Learning

Burak Uzkent, Christopher Yeh, Stefano Ermon; The IEEE Winter Conference on Applications of Computer Vision (WACV), 2020, pp. 1824-1833

Abstract


Traditionally, an object detector is applied to every part of the scene of interest, and both its accuracy and its computational cost increase with image resolution. However, in some application domains such as remote sensing, purchasing high spatial resolution images is expensive. To reduce the large computational and monetary costs associated with using high spatial resolution images, we propose a conditional reinforcement learning agent that adaptively selects the spatial resolution of each image that is provided to the detector. In particular, we train the agent in a dual reward setting to choose low spatial resolution images to be run through a coarse level detector when the image is dominated by large objects, and high spatial resolution images to be run through a fine level detector when it is dominated by small objects. This reduces the dependency on high spatial resolution images for building a robust detector and increases run-time efficiency. We perform experiments on the xView dataset, which consists of large images, where we increase run-time efficiency by 60% and use high resolution images only 30% of the time while maintaining accuracy similar to that of a detector that uses only high resolution images.
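
The adaptive selection described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the Bernoulli form of the policy output, the shape of the reward, and the `cost_weight` value are all assumptions made for illustration. The sketch only shows the general idea of sampling a per-image resolution choice and scoring it with a reward that combines an accuracy term with a cost term.

```python
import random
from dataclasses import dataclass


@dataclass
class Decision:
    use_high_res: bool     # True: acquire the high resolution image and run the fine detector
    prob_high_res: float   # policy output that produced this decision


def select_resolution(prob_high_res: float, rng: random.Random) -> Decision:
    """Sample a binary action from a (hypothetical) Bernoulli policy output."""
    return Decision(rng.random() < prob_high_res, prob_high_res)


def dual_reward(acc_chosen: float, acc_high_res: float,
                used_high_res: bool, cost_weight: float = 0.1) -> float:
    """Hypothetical dual reward, assumed here for illustration only.

    accuracy term: how the chosen detector compares with the high resolution one
    cost term:     a bonus granted when the high resolution image is not used
    """
    accuracy_term = acc_chosen - acc_high_res
    cost_term = 0.0 if used_high_res else cost_weight
    return accuracy_term + cost_term


if __name__ == "__main__":
    rng = random.Random(0)
    # The policy is fairly confident that large objects dominate this image.
    decision = select_resolution(prob_high_res=0.2, rng=rng)
    # Toy accuracy numbers, not results from the paper.
    acc = 0.40 if decision.use_high_res else 0.38
    print(decision, dual_reward(acc, 0.40, decision.use_high_res))
```

Under these assumptions, choosing the low resolution image is rewarded whenever the accuracy it gives up is smaller than `cost_weight`, which is one simple way to encode the trade-off between detection quality and image cost.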

Related Material


[bibtex]
@InProceedings{Uzkent_2020_WACV,
author = {Uzkent, Burak and Yeh, Christopher and Ermon, Stefano},
title = {Efficient Object Detection in Large Images Using Deep Reinforcement Learning},
booktitle = {The IEEE Winter Conference on Applications of Computer Vision (WACV)},
month = {March},
year = {2020},
pages = {1824-1833}
}