Deep Scene Image Classification With the MFAFVNet

Yunsheng Li, Mandar Dixit, Nuno Vasconcelos; Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017, pp. 5746-5754

Abstract


The problem of transferring a deep convolutional network trained for object recognition to the task of scene image classification is considered. An embedded implementation of the recently proposed mixture of factor analyzers Fisher vector (MFA-FV) is introduced. This enables the design of a network architecture, the MFAFVNet, that can be trained end to end. The new architecture involves the design of an MFA-FV layer that implements a statistically correct version of the MFA-FV through a combination of network computations and regularization. Compared to previous neural implementations of Fisher vectors, the MFAFVNet relies on a more powerful statistical model and a more accurate implementation. Compared to previous non-embedded models, the MFAFVNet relies on a state-of-the-art model that is now embedded into a CNN, which enables end-to-end training. Experiments show that the MFAFVNet achieves state-of-the-art performance on scene classification.

Related Material


[bibtex]
@InProceedings{Li_2017_ICCV,
  author    = {Li, Yunsheng and Dixit, Mandar and Vasconcelos, Nuno},
  title     = {Deep Scene Image Classification With the MFAFVNet},
  booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
  month     = {Oct},
  year      = {2017},
  pages     = {5746--5754}
}