Cascaded Zoom-In Detector for High Resolution Aerial Images

Akhil Meethal, Eric Granger, Marco Pedersoli; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2023, pp. 2046-2055


Detecting objects in aerial images is challenging because such images typically contain crowded small objects distributed non-uniformly over a high-resolution image (high in terms of pixel dimensions). Density cropping is a widely used method to improve small object detection in this setting: regions crowded with small objects are extracted and processed at high resolution. However, this is typically accomplished by adding other learnable components, which complicates training and inference relative to a standard detection process. In this paper, we propose an efficient Cascaded Zoom-in (CZ) detector that re-purposes the detector itself for density-guided training and inference. During training, density crops are located, labeled as a new class, and used to augment the training dataset. During inference, the density crops are first detected along with the base class objects, and then passed as input to a second stage of inference. This approach is easily integrated into any detector and, like the uniform cropping approach popular in aerial image detection, requires no significant change to the standard detection process. Experimental results on the aerial images of the challenging VisDrone and DOTA datasets verify the benefits of the proposed approach. The proposed CZ detector also provides state-of-the-art results over uniform cropping and other density cropping methods on the VisDrone dataset, increasing the detection mAP of small objects by more than 3 percentage points.
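The two-stage inference described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the detector is modeled as a plain callable, the class id `CROP_CLASS`, the names `cascaded_zoom_in`, `upscale`, `scale` and `crop_score`, the nearest-neighbour upscaling of each crop, and the final class-wise NMS merge are all hypothetical choices made for the example. The image is a toy nested list so the sketch stays dependency-free.

```python
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2) in pixels
Detection = Tuple[Box, float, int]        # (box, score, class id)
Image = List[List[int]]                   # toy image: rows of pixel values

CROP_CLASS = 0  # hypothetical id of the extra "density crop" class


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(dets: List[Detection], thr: float = 0.5) -> List[Detection]:
    """Greedy class-wise NMS; an assumed way to merge both stages."""
    kept: List[Detection] = []
    for d in sorted(dets, key=lambda d: d[1], reverse=True):
        if all(d[2] != k[2] or iou(d[0], k[0]) <= thr for k in kept):
            kept.append(d)
    return kept


def upscale(img: Image, s: int) -> Image:
    """Nearest-neighbour upscaling by an integer factor (assumption)."""
    return [[v for v in row for _ in range(s)] for row in img for _ in range(s)]


def cascaded_zoom_in(
    image: Image,
    detect: Callable[[Image], List[Detection]],
    scale: int = 2,
    crop_score: float = 0.5,
) -> List[Detection]:
    # Stage 1: one pass yields base-class objects and density crops.
    first = detect(image)
    results = [d for d in first if d[2] != CROP_CLASS]
    crops = [d for d in first if d[2] == CROP_CLASS and d[1] >= crop_score]

    # Stage 2: re-run the same detector on each upscaled density crop.
    for (x1, y1, x2, y2), _, _ in crops:
        x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)
        if x2 <= x1 or y2 <= y1:
            continue  # skip degenerate crops
        patch = upscale([row[x1:x2] for row in image[y1:y2]], scale)
        for (bx1, by1, bx2, by2), score, cls in detect(patch):
            if cls == CROP_CLASS:
                continue  # no further zooming in this sketch
            # Map the box from crop coordinates back to the full image.
            results.append((
                (x1 + bx1 / scale, y1 + by1 / scale,
                 x1 + bx2 / scale, y1 + by2 / scale),
                score, cls,
            ))
    return nms(results)
```

Any detector that returns `(box, score, class id)` tuples could be passed as `detect`; the same callable serves both stages, which mirrors the abstract's point that no additional learnable component is needed.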

Related Material

@InProceedings{Meethal_2023_CVPR,
    author    = {Meethal, Akhil and Granger, Eric and Pedersoli, Marco},
    title     = {Cascaded Zoom-In Detector for High Resolution Aerial Images},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2023},
    pages     = {2046-2055}
}