Grafit: Learning Fine-Grained Image Representations With Coarse Labels

Hugo Touvron, Alexandre Sablayrolles, Matthijs Douze, Matthieu Cord, Hervé Jégou; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 874-884

Abstract


This paper tackles the problem of learning a finer representation than the one provided by training labels. This enables fine-grained category retrieval of images in a collection annotated with coarse labels only. Our network is trained with a nearest-neighbor classifier objective and an instance loss inspired by self-supervised learning. By jointly leveraging the coarse labels and the underlying fine-grained latent space, our method significantly improves the accuracy of category-level retrieval. Our strategy outperforms all competing methods for retrieving or classifying images at a finer granularity than that available at train time. It also improves accuracy on transfer learning tasks to fine-grained datasets.
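To make the nearest-neighbor idea concrete, here is a minimal sketch of classifying a query embedding by majority vote among its most similar gallery embeddings. It is not the paper's training objective or code: the function names (`cosine`, `knn_predict`), the choice of cosine similarity, the value of `k`, and the toy 2-D embeddings and labels are all assumptions made for illustration only.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def knn_predict(gallery_emb, gallery_labels, query, k=3):
    """Return the majority label among the k gallery items most similar to query."""
    # Rank gallery indices by decreasing similarity to the query.
    ranked = sorted(
        range(len(gallery_emb)),
        key=lambda i: cosine(query, gallery_emb[i]),
        reverse=True,
    )
    votes = Counter(gallery_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy embeddings: two fine-grained classes sharing one coarse label.
gallery_emb = [
    [1.0, 0.0], [0.9, 0.1], [0.8, 0.2],   # fine class "a"
    [0.0, 1.0], [0.1, 0.9], [0.2, 0.8],   # fine class "b"
]
gallery_labels = ["a", "a", "a", "b", "b", "b"]

pred_a = knn_predict(gallery_emb, gallery_labels, [1.0, 0.05], k=3)
pred_b = knn_predict(gallery_emb, gallery_labels, [0.05, 1.0], k=3)
```

In this toy setting the two queries land next to different fine-grained groups, which is the property a representation needs in order to support retrieval at a finer granularity than the labels used for training.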

Related Material


@InProceedings{Touvron_2021_ICCV,
    author    = {Touvron, Hugo and Sablayrolles, Alexandre and Douze, Matthijs and Cord, Matthieu and J\'egou, Herv\'e},
    title     = {Grafit: Learning Fine-Grained Image Representations With Coarse Labels},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {874-884}
}