Self-Improving Classification Performance Through GAN Distillation

Matteo Pennisi, Simone Palazzo, Concetto Spampinato; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2021, pp. 1640-1648


The availability of a large dataset can be a key factor in achieving good generalization capabilities when training deep learning models. Unfortunately, dataset collection is an expensive and time-consuming task, especially in specific application domains (e.g., medicine). In this paper, we present an approach for overcoming dataset size limitations by combining a classifier with a generative adversarial network (GAN) trained to synthesize "hard" samples through a triplet loss, encouraging the model to learn class features which may be under-represented or ambiguous in a small dataset. We evaluate the proposed approach on subsets of CIFAR-10 to simulate low data availability, and compare the results achieved by our method with those obtained when training in a standard supervised setting on the same reduced set of data. Performance analysis shows a significant improvement in accuracy when training the model on GAN-generated hard samples: our GAN distillation approach improves accuracy in the reduced dataset scenario by about 5 percentage points compared to standard supervised training. Ablation studies and feature visualization confirm that our generative approach consistently produces synthetic images that allow the model to improve its performance even with low data availability.
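The abstract's key mechanism is a triplet loss that steers the GAN toward "hard" samples, i.e. samples whose embeddings lie near another class. As a minimal sketch of this idea (the margin value and embedding setup below are illustrative assumptions, not details from the paper), the standard triplet loss is positive exactly when a sample's distance to a same-class embedding is not sufficiently smaller than its distance to an other-class embedding:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: pull the anchor toward the positive
    embedding and push it away from the negative by at least `margin`.
    (margin=1.0 is an illustrative choice, not taken from the paper.)"""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

# A generated sample is "easy" when the other-class (negative) embedding
# is far away, and "hard" when the negative sits close to the anchor,
# leaving the loss positive.
a = np.array([0.0, 0.0])
easy = triplet_loss(a, np.array([0.1, 0.0]), np.array([3.0, 0.0]))
hard = triplet_loss(a, np.array([0.1, 0.0]), np.array([0.5, 0.0]))
```

Here `easy` evaluates to 0.0 while `hard` is positive, illustrating how a generator optimized against such a loss would be driven to produce samples close to class boundaries.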

Related Material

@InProceedings{Pennisi_2021_ICCV, author = {Pennisi, Matteo and Palazzo, Simone and Spampinato, Concetto}, title = {Self-Improving Classification Performance Through GAN Distillation}, booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops}, month = {October}, year = {2021}, pages = {1640-1648} }