The Power of Tiling for Small Object Detection

F. Ozge Unel, Burak O. Ozkalayci, Cevahir Cigla; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2019


Deep neural network based techniques are the state of the art for object detection and classification, helped by advances in computational power and memory efficiency. Although these networks have been adapted to mobile platforms at some cost in accuracy, the increasing resolution of visual sources makes the problem even harder by raising the expectation that all the details in an image will be exploited. Real-time small object detection on low-power mobile devices has been one of the fundamental problems of surveillance applications. In this study, we address the detection of pedestrians and vehicles onboard a micro aerial vehicle (MAV) with high-resolution imagery. For this purpose, we use PeleeNet, to the best of our knowledge the most efficient network model on mobile GPUs, as the backbone of an SSD network, together with a 38x38 feature map from an earlier layer. After illustrating the low accuracy of state-of-the-art object detectors under the MAV scenario, we introduce a tiling-based approach that is applied in both the training and inference phases. The proposed technique limits the loss of detail in object detection while feeding the network a fixed-size input. The improvements provided by the proposed approach are shown by in-depth experiments performed on Nvidia Jetson TX1 and TX2 using the VisDrone2018 dataset.
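
The tiling idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the tile size, the overlap, or how per-tile detections are merged, so everything below is an assumption made for illustration: the function names `tile_grid` and `to_image_coords`, the 25% default overlap, and the choice to place the last tile flush against the image border. It is not the authors' implementation.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max)
Tile = Tuple[int, int, int, int]          # same layout, in full-image pixels


def _starts(length: int, tile: int, stride: int) -> List[int]:
    """Start offsets along one axis; the last tile is placed flush to the edge."""
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile + 1, stride))
    if starts[-1] != length - tile:
        starts.append(length - tile)
    return starts


def tile_grid(width: int, height: int, tile_w: int, tile_h: int,
              overlap: float = 0.25) -> List[Tile]:
    """Cover a width x height image with overlapping tiles (hypothetical helper).

    `overlap` is the assumed fraction of a tile shared with its neighbour.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    stride_x = max(1, int(tile_w * (1.0 - overlap)))
    stride_y = max(1, int(tile_h * (1.0 - overlap)))
    xs = _starts(width, tile_w, stride_x)
    ys = _starts(height, tile_h, stride_y)
    # Clip to the image so tiles never extend past the border.
    return [(x, y, min(x + tile_w, width), min(y + tile_h, height))
            for y in ys for x in xs]


def to_image_coords(box: Box, tile: Tile) -> Box:
    """Shift a box predicted inside a tile back into full-image coordinates."""
    x0, y0 = tile[0], tile[1]
    return (box[0] + x0, box[1] + y0, box[2] + x0, box[3] + y0)


tiles = tile_grid(1920, 1080, 640, 640, overlap=0.25)
```

Each tile would be cropped and resized to the detector's fixed input size, and its detections mapped back with `to_image_coords`. Objects in the overlap regions can be detected more than once, so a merging step such as non-maximum suppression over the combined boxes would typically follow; that step is also an assumption here, not something the abstract describes.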

Related Material

author = {Ozge Unel, F. and Ozkalayci, Burak O. and Cigla, Cevahir},
title = {The Power of Tiling for Small Object Detection},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2019}