Fine-Grained Categorization by Alignments

E. Gavves, B. Fernando, C.G.M. Snoek, A.W.M. Smeulders, T. Tuytelaars; The IEEE International Conference on Computer Vision (ICCV), 2013, pp. 1713-1720


The aim of this paper is fine-grained categorization without human interaction. Unlike prior work, which relies on detectors for specific object parts, we propose to localize distinctive details by roughly aligning the objects using just the overall shape, since implicit to fine-grained categorization is the existence of a super-class shape shared among all classes. The alignments are then used to transfer part annotations from training images to test images (supervised alignment), or to blindly yet consistently segment the object into a number of regions (unsupervised alignment). We furthermore argue that in the distinction of fine-grained sub-categories, classification-oriented encodings like Fisher vectors are better suited for describing localized information than popular matching-oriented features like HOG. We evaluate the method on the CU-2011 Birds and Stanford Dogs fine-grained datasets, outperforming the state of the art.

Related Material

author = {Gavves, E. and Fernando, B. and Snoek, C.G.M. and Smeulders, A.W.M. and Tuytelaars, T.},
title = {Fine-Grained Categorization by Alignments},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {December},
year = {2013}