Efficient Fine-Grained Classification and Part Localization Using One Compact Network

Xiyang Dai, Ben Southall, Nhon Trinh, Bogdan Matei; Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017, pp. 996-1004

Abstract


Fine-grained classification of objects such as vehicles, natural objects and other classes is an important problem in visual recognition. Discriminative parts and regions of objects are key contributors to fine-grained recognition. We propose a novel compact multi-task network architecture that jointly optimizes both localization of parts and fine-grained class labels by learning from training data. The localization and classification sub-networks share most of their weights, yet have dedicated convolutional layers to capture finer-level, class-specific information. We design our model to be memory- and computation-efficient so that it can be easily embedded in mobile applications. We demonstrate the effectiveness of our approach through experiments that achieve a new state-of-the-art result of 93.1% on the Stanford Cars-196 dataset, with a significantly smaller multi-task network (30M parameters) and significantly faster testing speed (78 FPS) compared to recently published results.
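
As a rough illustration of what jointly optimizing part localization and class labels can look like, the sketch below combines a classification loss and a localization loss into one objective. The abstract does not state the actual loss functions, so everything here is an assumption: the squared-distance localization term, the weighting factor `lam`, and the function names `softmax_cross_entropy`, `part_l2_loss`, and `joint_loss` are hypothetical and are not taken from the paper.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy for one example, using the log-sum-exp trick for stability."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def part_l2_loss(pred_parts, true_parts):
    """Mean squared distance between predicted and annotated (x, y) part locations.

    Assumption: the paper's real localization loss may differ.
    """
    total = 0.0
    for (px, py), (tx, ty) in zip(pred_parts, true_parts):
        total += (px - tx) ** 2 + (py - ty) ** 2
    return total / len(true_parts)


def joint_loss(logits, label, pred_parts, true_parts, lam=1.0):
    """Hypothetical multi-task objective: classification + lam * localization."""
    cls = softmax_cross_entropy(logits, label)
    loc = part_l2_loss(pred_parts, true_parts)
    return cls + lam * loc
```

In a network with shared weights, both terms would send gradients through the shared layers, while each dedicated branch would receive gradients only from its own term. The weight `lam` is an illustrative knob for balancing the two tasks.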

Related Material


[bibtex]
@InProceedings{Dai_2017_ICCV,
author = {Dai, Xiyang and Southall, Ben and Trinh, Nhon and Matei, Bogdan},
title = {Efficient Fine-Grained Classification and Part Localization Using One Compact Network},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV) Workshops},
month = {Oct},
year = {2017}
}