An Object Is Worth Six Thousand Pictures: The Egocentric, Manual, Multi-Image (EMMI) Dataset

Xiaohan Wang, Fernanda M. Eliott, James Ainooson, Joshua H. Palmer, Maithilee Kunda; Proceedings of the IEEE International Conference on Computer Vision (ICCV) Workshops, 2017, pp. 2364-2372

Abstract


We describe the Egocentric, Manual, Multi-Image (EMMI) dataset, a new image dataset collected to enable the study of how appearance-related and distributional properties of visual experience affect learning outcomes. Images in EMMI come from first-person, wearable-camera recordings of common household objects and toys being manually manipulated to undergo structured transformations such as rotation and translation. We also present results from initial experiments with deep convolutional neural networks that begin to examine how different distributions of training data can affect visual object recognition, and how the representation of properties such as rotation invariance can be studied in novel ways using the unique properties of EMMI.

Related Material


[bibtex]
@InProceedings{Wang_2017_ICCV,
author = {Wang, Xiaohan and Eliott, Fernanda M. and Ainooson, James and Palmer, Joshua H. and Kunda, Maithilee},
title = {An Object Is Worth Six Thousand Pictures: The Egocentric, Manual, Multi-Image (EMMI) Dataset},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV) Workshops},
month = {Oct},
year = {2017}
}