Finding Time Together: Detection and Classification of Focused Interaction in Egocentric Video

Sophia Bano, Stephen J. McKenna, Jianguo Zhang; Proceedings of the IEEE International Conference on Computer Vision (ICCV) Workshops, 2017, pp. 2322-2330

Abstract

Focused interaction occurs when co-present individuals, having mutual focus of attention, interact by establishing face-to-face engagement and direct conversation. Face-to-face engagement is often not maintained throughout the entirety of a focused interaction. In this paper, we present an online method for automatic classification of unconstrained egocentric (first-person perspective) videos into segments having no focused interaction, focused interaction when the camera wearer is stationary, and focused interaction when the camera wearer is moving. We extract features from both the audio and video data streams and perform temporal segmentation using support vector machines with linear and non-linear kernels. We provide empirical evidence that fusion of visual face track scores, camera motion profile, and audio voice activity scores is an effective combination for focused interaction classification.
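
As a rough illustration of the approach summarised in the abstract, the sketch below fuses per-window face-track, camera-motion, and voice-activity scores by simple concatenation and classifies each window into the three segment types with linear and RBF-kernel SVMs (scikit-learn). The feature values, label rule, and window setup are synthetic placeholders for illustration only, not the authors' features, data, or implementation.

```python
# Minimal sketch (not the authors' code): early fusion of three per-window
# scores and SVM classification into {0: no focused interaction,
# 1: focused interaction while stationary, 2: focused interaction while moving}.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_windows = 300

# Hypothetical per-window scores: face-track confidence, camera-motion
# magnitude, and voice-activity score (all synthetic here).
face_score = rng.uniform(0, 1, n_windows)
motion = rng.uniform(0, 1, n_windows)
voice = rng.uniform(0, 1, n_windows)
X = np.column_stack([face_score, motion, voice])  # fusion by concatenation

# Synthetic labels following a toy rule, just to make the demo runnable.
y = np.where(face_score + voice < 0.8, 0, np.where(motion < 0.5, 1, 2))

# Split along time to mimic online operation on unseen video.
split = int(0.7 * n_windows)
X_train, X_test, y_train, y_test = X[:split], X[split:], y[:split], y[split:]

# Compare a linear and a non-linear (RBF) kernel, as in the paper's setup.
for kernel in ("linear", "rbf"):
    clf = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    clf.fit(X_train, y_train)
    print(kernel, "accuracy:", clf.score(X_test, y_test))
```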

Related Material

[pdf]
[bibtex]
@InProceedings{Bano_2017_ICCV,
author = {Bano, Sophia and McKenna, Stephen J. and Zhang, Jianguo},
title = {Finding Time Together: Detection and Classification of Focused Interaction in Egocentric Video},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV) Workshops},
month = {Oct},
year = {2017}
}