Can We Trust Bounding Box Annotations for Object Detection?
Object detection is a classical problem in computer vision, and the vast majority of approaches require large annotated datasets for training and evaluation. The most popular representation is the bounding box (BB), usually defined as the minimal-area rectangle that encompasses the whole object region. However, the annotation process involves some subjectivity (particularly when occlusions are present), and annotation quality may degrade as annotators become fatigued. Comparing BBs is crucial for evaluation, and the Intersection-over-Union (IoU) is the standard similarity metric. In this paper, we provide theoretical and experimental results indicating that the IoU can be strongly affected even by small annotation discrepancies in popular object detection datasets. As a consequence, the Average Precision (AP) commonly used to evaluate object detectors is also influenced by annotation bias or noise, particularly for small objects and tighter IoU thresholds.
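As an illustrative sketch (not from the paper itself), the standard IoU between two axis-aligned boxes can be computed as below; shifting an annotation by a single pixel already shows the size dependence the abstract describes, with small boxes penalized far more than large ones. The box format `(x1, y1, x2, y2)` is an assumption made here for illustration.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes in (x1, y1, x2, y2) format."""
    # Coordinates of the intersection rectangle (empty if boxes do not overlap).
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# The same 1-pixel annotation shift affects a small box much more than a large one:
small = iou((0, 0, 10, 10), (1, 1, 11, 11))      # 81/119  ~= 0.68
large = iou((0, 0, 100, 100), (1, 1, 101, 101))  # 9801/10199 ~= 0.96
```

Under a typical evaluation threshold of IoU >= 0.7, the shifted small box would already count as a miss while the large one easily passes, which is consistent with the abstract's point about small objects and tight thresholds.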