Learning to Localise and Count With Incomplete Dot-Annotations

Feng Chen, Michael P. Pound, Andrew P. French; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2021, pp. 1612-1620


Annotating training data is a time-consuming and labor-intensive process in deep learning, especially for images with many objects present. In this paper, we propose a method that allows deep networks to be trained on data with a reduced number of annotations per image in heatmap regression tasks (e.g. object localisation and counting), by applying an asymmetric loss function. This reduction can be imposed by researchers, who ask annotators to intentionally label only 50% of what they see in each image - a form of 'few-click' annotation. Our method also has the secondary benefit of counteracting labels unintentionally missed by annotators. We conduct experiments on wheat spikelet localisation and crowd counting to assess the effectiveness and robustness of our method. Results show that an asymmetric loss function is effective across different models and datasets, even in extreme cases where few annotations are provided (e.g. with 90% of the original annotations removed). Whilst tuning of the key parameters is required, we find that setting conservative parameter values can help in more realistic situations, where only small amounts of data have been missed by annotators.
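
The abstract does not state the form of the asymmetric loss, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a squared-error heatmap loss in which over-prediction (a response where the incomplete ground truth is empty) is penalised less than under-prediction, so that unlabelled objects are not strongly suppressed. The names asymmetric_mse, over_weight and drop_annotations are hypothetical and chosen for illustration; drop_annotations simulates an annotator labelling only a fraction of the objects.

```python
import numpy as np


def asymmetric_mse(pred, target, over_weight=0.1):
    """Squared error with a reduced penalty where pred exceeds target.

    Assumption: `over_weight` (hypothetical parameter) scales the error
    at pixels where the network predicts more than the ground truth,
    which is where a missing annotation would show up.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    diff = pred - target
    # Full weight for under-prediction, reduced weight for over-prediction.
    weights = np.where(diff > 0, over_weight, 1.0)
    return float(np.mean(weights * diff ** 2))


def drop_annotations(points, keep_fraction, rng):
    """Randomly keep a fraction of dot annotations (simulated few-click labels)."""
    points = np.asarray(points)
    n_keep = int(round(keep_fraction * len(points)))
    idx = rng.choice(len(points), size=n_keep, replace=False)
    # Sort indices so the kept points stay in their original order.
    return points[np.sort(idx)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dots = rng.integers(0, 64, size=(10, 2))  # ten (row, col) dot annotations
    kept = drop_annotations(dots, keep_fraction=0.5, rng=rng)
    print(len(dots), "annotations reduced to", len(kept))
    print(asymmetric_mse([1.0, 0.0], [0.0, 1.0], over_weight=0.1))
```

With over_weight set to 1.0 the sketch reduces to an ordinary mean squared error; smaller values tolerate more unlabelled objects, which is one way to read the abstract's remark that conservative parameter values suit cases where only a few labels are missing.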

Related Material

@InProceedings{Chen_2021_ICCV,
    author    = {Chen, Feng and Pound, Michael P. and French, Andrew P.},
    title     = {Learning to Localise and Count With Incomplete Dot-Annotations},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
    month     = {October},
    year      = {2021},
    pages     = {1612-1620}
}