Understanding the Nature of First-Person Videos: Characterization and Classification using Low-Level Features

Cheston Tan, Hanlin Goh, Vijay Chandrasekhar, Liyuan Li, Joo-Hwee Lim; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2014, pp. 535-542

Abstract


First-person view (FPV) video data is set to proliferate rapidly, due to many consumer wearable-camera devices coming onto the market. Research into FPV (or "egocentric") vision is also becoming more common in the computer vision community. However, it is still unclear what the fundamental characteristics of such data are. How is it really different from third-person view (TPV) data? Can all FPV data be treated the same? In this first attempt to approach these questions in a quantitative and empirical manner, we analyzed a meta-collection of 21 FPV and TPV datasets totaling more than 165 hours of video. We performed the first quantitative characterization of FPV videos over multiple datasets, encompassing virtually all available FPV datasets. Validating this characterization, linear classifiers trained on low-level features to perform FPV-versus-TPV classification achieved good baseline performance. Accuracy peaked at 81% for 2-minute clips, but 67% accuracy was achieved even with 1-second clips. Our low-level features are fast to compute and do not require annotation. Overall, our work uncovered insights regarding the basic nature and characteristics of FPV data.
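The classification setup described in the abstract (a linear classifier trained on low-level features) could look roughly like the sketch below. The abstract does not list the paper's features, so everything here is an illustrative assumption: the four features (frame-difference and intensity statistics), the helper names `low_level_features`, `make_toy_clip`, `make_dataset`, `train_linear_classifier` and `predict`, and the synthetic data, in which "FPV-like" clips are simply given more camera shake than "TPV-like" clips. It is a toy under those assumptions, not the authors' method, and its accuracy says nothing about real video.

```python
import numpy as np

def low_level_features(clip):
    """Summarize a grayscale clip of shape (frames, height, width).

    These four statistics are stand-ins chosen for illustration only.
    """
    diffs = np.abs(np.diff(clip, axis=0))  # per-pixel change between consecutive frames
    return np.array([diffs.mean(), diffs.std(), clip.mean(), clip.std()])

def make_toy_clip(rng, shake, frames=8, size=16):
    """Synthetic clip: one random image, shifted by up to `shake` pixels per frame."""
    base = rng.random((size, size))
    shifts = rng.integers(-shake, shake + 1, size=(frames, 2))
    clip = np.stack([np.roll(base, (int(dy), int(dx)), axis=(0, 1)) for dy, dx in shifts])
    return clip + 0.02 * rng.standard_normal(clip.shape)  # mild sensor noise

def make_dataset(rng, n_per_class):
    """Label 0 = static 'TPV-like' clips, label 1 = shaky 'FPV-like' clips (toy assumption)."""
    X, y = [], []
    for label, shake in ((0, 0), (1, 2)):
        for _ in range(n_per_class):
            X.append(low_level_features(make_toy_clip(rng, shake)))
            y.append(label)
    return np.array(X), np.array(y)

def train_linear_classifier(X, y, lr=0.5, steps=500):
    """Logistic regression fitted by batch gradient descent on standardized features."""
    mu, sigma = X.mean(axis=0), X.std(axis=0) + 1e-8
    Z = (X - mu) / sigma
    w, b = np.zeros(Z.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))  # predicted probability of label 1
        grad = p - y                             # gradient of the log loss w.r.t. the logit
        w -= lr * (Z.T @ grad) / len(y)
        b -= lr * grad.mean()
    return mu, sigma, w, b

def predict(model, X):
    """Return 0/1 labels from the sign of the linear decision function."""
    mu, sigma, w, b = model
    return ((((X - mu) / sigma) @ w + b) > 0).astype(int)

rng = np.random.default_rng(0)
X_train, y_train = make_dataset(rng, 100)
X_test, y_test = make_dataset(rng, 50)
model = train_linear_classifier(X_train, y_train)
accuracy = (predict(model, X_test) == y_test).mean()
print(f"toy held-out accuracy: {accuracy:.2f}")
```

Because such features are plain array statistics, they need no annotation and are cheap to compute, which matches the property the abstract highlights. Working with real clips would only require replacing `make_toy_clip` with decoded video frames.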

Related Material


[bibtex]
@InProceedings{Tan_2014_CVPR_Workshops,
author = {Tan, Cheston and Goh, Hanlin and Chandrasekhar, Vijay and Li, Liyuan and Lim, Joo-Hwee},
title = {Understanding the Nature of First-Person Videos: Characterization and Classification using Low-Level Features},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2014},
pages = {535-542}
}