Phantom of Benchmark Dataset: Resolving Label Ambiguity Problem on Image Recognition in the Wild

Chung, Hyunhee; Park, Kyung Ho; Seo, Taewon; Cho, Sungwoo

Hyunhee Chung, Kyung Ho Park, Taewon Seo, Sungwoo Cho; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops, 2023, pp. 444-453

Abstract

While deep neural networks have achieved remarkable results in image recognition tasks, they conventionally rely on benchmark datasets that presume a well-designed label space in which each image corresponds to one particular class; we denote such data as obvious samples. However, we claim that this assumption does not always hold, either in the real world or in the widely used ImageNet. We find that a label ambiguity problem exists: several samples are inherently ambiguous and are hard to annotate with one particular label. In this study, we present a series of analyses of label ambiguity and propose a solution to it, with the following contributions. First, we define the label ambiguity types that exist in conventional image recognition and release the corresponding datasets, retrieved from ImageNet and the real world. We further show that label ambiguity degrades classification performance, which justifies careful treatment of label-ambiguous samples. Second, we propose the Consistent Sample Selector (CSS), a novel framework that addresses the label ambiguity problem. Given obvious and ambiguous samples, CSS learns representations for each label from the obvious samples and selects the ambiguous samples whose semantics are consistent with the obvious ones; it then updates the training set by concatenating the obvious samples with the selected ambiguous ones. Lastly, we empirically show that CSS improves classification performance and simultaneously improves the inductive bias, bringing it closer to how human vision recognizes images.
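The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how consistency between ambiguous and obvious samples is measured, so everything concrete here is an assumption made for illustration: embeddings are taken as given, each label is summarized by the mean of its obvious-sample embeddings (a "prototype"), and an ambiguous sample is kept when its cosine similarity to the prototype of its candidate label reaches a hypothetical `threshold`. The function names are also hypothetical; this is not the authors' CSS implementation.

```python
import numpy as np


def _l2_normalize(x):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def select_consistent_samples(obvious_emb, obvious_labels,
                              ambiguous_emb, ambiguous_labels,
                              threshold=0.5):
    """Return indices of ambiguous samples judged consistent with the obvious ones.

    Assumption: consistency = cosine similarity to the per-label mean embedding
    of the obvious samples. The paper may use a different criterion.
    """
    obvious_emb = _l2_normalize(np.asarray(obvious_emb, dtype=float))
    ambiguous_emb = _l2_normalize(np.asarray(ambiguous_emb, dtype=float))
    obvious_labels = np.asarray(obvious_labels)

    # One prototype per label, learned from obvious samples only.
    prototypes = {}
    for label in np.unique(obvious_labels):
        mean = obvious_emb[obvious_labels == label].mean(axis=0)
        prototypes[label] = mean / np.linalg.norm(mean)

    keep = []
    for i, (z, label) in enumerate(zip(ambiguous_emb, ambiguous_labels)):
        proto = prototypes.get(label)
        if proto is None:
            continue  # no obvious samples for this label, so nothing to compare with
        if float(z @ proto) >= threshold:
            keep.append(i)
    return np.array(keep, dtype=int)


def build_training_set(obvious_emb, obvious_labels,
                       ambiguous_emb, ambiguous_labels, threshold=0.5):
    """Concatenate the obvious samples with the selected ambiguous ones."""
    keep = select_consistent_samples(obvious_emb, obvious_labels,
                                     ambiguous_emb, ambiguous_labels, threshold)
    x = np.concatenate([np.asarray(obvious_emb, dtype=float),
                        np.asarray(ambiguous_emb, dtype=float)[keep]])
    y = np.concatenate([np.asarray(obvious_labels),
                        np.asarray(ambiguous_labels)[keep]])
    return x, y
```

In this sketch the threshold controls how strictly ambiguous samples are filtered: a higher value keeps only samples that sit very close to their label's prototype, while a lower value admits more of them into the updated training set.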

Related Material

@InProceedings{Chung_2023_WACV,
    author    = {Chung, Hyunhee and Park, Kyung Ho and Seo, Taewon and Cho, Sungwoo},
    title     = {Phantom of Benchmark Dataset: Resolving Label Ambiguity Problem on Image Recognition in the Wild},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops},
    month     = {January},
    year      = {2023},
    pages     = {444-453}
}