Boxy Vehicle Detection in Large Images

Karsten Behrendt; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 0-0


Camera-based object detection and automated driving in general have greatly improved over the last few years. Parts of these improvements can be attributed to public datasets which allow researchers around the world to work with data that would often be too expensive to collect and annotate for individual teams. Current vehicle detection datasets and approaches often focus on axis-aligned bounding boxes or semantic segmentation. Axis-aligned bounding boxes often misrepresent vehicle sizes and may intrude into neighboring lanes. While pixel level segmentations are more accurate, they can be hard to process and leverage for trajectory planning systems. We therefore present the Boxy dataset for image-based vehicle detection. Boxy is one of the largest public vehicle detection datasets with 1.99 million annotated vehicles in 200,000 images, including sunny, rainy, and nighttime driving. If possible, vehicle annotations are split into their visible sides to give the impression of 3D boxes for a more accurate representation with little overhead. Five megapixel images with annotations down to a few pixels make this dataset especially challenging. With Boxy, we provide initial benchmark challenges for bounding box, polygon, and real-time detections. All benchmarks are open-source so that additional metrics and benchmarks may be added at

Related Material

author = {Behrendt, Karsten},
title = {Boxy Vehicle Detection in Large Images},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
month = {Oct},
year = {2019}