The Power of Ensembles for Active Learning in Image Classification

William H. Beluch, Tim Genewein, Andreas Nürnberger, Jan M. Köhler; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 9368-9377

Abstract


Deep learning methods have become the de-facto standard for challenging image processing tasks such as image classification. One major hurdle of deep learning approaches is that large sets of labeled data are necessary, which can be prohibitively costly to obtain, particularly in medical image diagnosis applications. Active learning techniques can alleviate this labeling effort. In this paper we investigate some recently proposed methods for active learning with high-dimensional data and convolutional neural network classifiers. We compare ensemble-based methods against Monte-Carlo Dropout and geometric approaches. We find that ensembles perform better and lead to more calibrated predictive uncertainties, which are the basis for many active learning algorithms. To investigate why Monte-Carlo Dropout uncertainties perform worse, we explore potential differences in isolation in a series of experiments. We show results for MNIST and CIFAR-10, on which we achieve a test set accuracy of 90% with roughly 12,200 labeled images, and initial results on ImageNet. Additionally, we show results on a large, highly class-imbalanced diabetic retinopathy dataset. We observe that the ensemble-based active learning effectively counteracts this imbalance during acquisition.
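As a rough illustration of how ensemble predictive uncertainties can drive acquisition, the sketch below averages the softmax outputs of several ensemble members and ranks unlabeled pool samples by the entropy of that averaged prediction. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names `predictive_entropy` and `select_for_labeling`, the toy probabilities, and the choice of predictive entropy as the acquisition score are all illustrative.

```python
import numpy as np


def predictive_entropy(member_probs):
    """Entropy of the ensemble-averaged class distribution.

    member_probs: array of shape (n_members, n_samples, n_classes) holding
    each ensemble member's softmax output for every unlabeled pool sample.
    Returns an array of shape (n_samples,).
    """
    mean_probs = member_probs.mean(axis=0)  # average over ensemble members
    eps = 1e-12                             # avoids log(0)
    return -(mean_probs * np.log(mean_probs + eps)).sum(axis=1)


def select_for_labeling(member_probs, batch_size):
    """Indices of the batch_size most uncertain pool samples, most uncertain first."""
    scores = predictive_entropy(member_probs)
    return np.argsort(-scores)[:batch_size]


# Hypothetical toy pool: 2 ensemble members, 3 unlabeled samples, 2 classes.
toy_probs = np.array([
    [[0.90, 0.10], [0.50, 0.50], [0.99, 0.01]],
    [[0.80, 0.20], [0.40, 0.60], [0.98, 0.02]],
])

# The second sample (index 1) has the flattest averaged prediction,
# so it is the one selected for labeling.
picked = select_for_labeling(toy_probs, batch_size=1)
```

In an active learning loop, the selected samples would be labeled, added to the training set, and the ensemble retrained before the next acquisition round.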

Related Material


[bibtex]
@InProceedings{Beluch_2018_CVPR,
author = {Beluch, William H. and Genewein, Tim and Nürnberger, Andreas and Köhler, Jan M.},
title = {The Power of Ensembles for Active Learning in Image Classification},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2018},
pages = {9368-9377}
}