Discovering Human Interactions in Videos With Limited Data Labeling

Mehran Khodabandeh, Arash Vahdat, Guang-Tong Zhou, Hossein Hajimirsadeghi, Mehrsan Javan Roshtkhari, Greg Mori, Stephen Se; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2015, pp. 9-18

Abstract

We present a novel approach for discovering human interactions in videos. Activity understanding techniques usually require a large number of labeled examples, which are not available in many practical cases. Here, we focus on recovering semantically meaningful clusters of human-human and human-object interactions in an unsupervised fashion. We introduce a new iterative solution based on Maximum Margin Clustering (MMC) that also accepts user feedback to refine the clusters. This is achieved by formulating the whole process as a unified constrained latent max-margin clustering problem. Extensive experiments have been carried out on three challenging datasets: Collective Activity, VIRAT, and UT-Interaction. Empirical results demonstrate that the proposed algorithm can efficiently discover perfect semantic clusters of human interactions with only a small amount of labeling effort.
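The abstract only names the optimization, so the sketch below is a rough, assumption-laden illustration of a generic iterative max-margin clustering loop of the kind MMC methods use: alternate between fitting a max-margin classifier to the current cluster assignments and reassigning each example to the cluster with the highest decision score. The `must_link` argument is a hypothetical stand-in for the paper's user feedback; this is not the authors' constrained latent formulation, and it omits the cluster-balance constraint that full MMC methods use to avoid trivial solutions.

```python
# Minimal sketch of iterative max-margin clustering with simple
# user-feedback constraints. Not the paper's method: a toy loop that
# alternates (1) max-margin training on current labels and
# (2) reassignment by decision score.
import numpy as np
from sklearn.svm import LinearSVC

def iterative_mmc(X, n_clusters=2, must_link=(), n_iters=10, seed=0):
    rng = np.random.default_rng(seed)
    y = rng.integers(0, n_clusters, size=len(X))   # random initial clustering
    for _ in range(n_iters):
        if len(np.unique(y)) < 2:                  # degenerate clustering; stop
            break
        clf = LinearSVC(C=1.0).fit(X, y)           # max-margin step
        scores = clf.decision_function(X)
        if n_clusters == 2:                        # binary: sign of the margin
            y_new = (scores > 0).astype(int)
        else:                                      # multi-class: best-scoring cluster
            y_new = scores.argmax(axis=1)
        for i, j in must_link:                     # hypothetical user feedback:
            y_new[j] = y_new[i]                    # force pairs into one cluster
        if np.array_equal(y_new, y):               # assignments stable: converged
            break
        y = y_new
    return y
```

For instance, `iterative_mmc(features, n_clusters=3, must_link=[(0, 5)])` would force examples 0 and 5 into the same cluster, mimicking must-link-style feedback used to refine the discovered clusters.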

Related Material

[pdf]
[bibtex]
@InProceedings{Khodabandeh_2015_CVPR_Workshops,
author = {Khodabandeh, Mehran and Vahdat, Arash and Zhou, Guang-Tong and Hajimirsadeghi, Hossein and Javan Roshtkhari, Mehrsan and Mori, Greg and Se, Stephen},
title = {Discovering Human Interactions in Videos With Limited Data Labeling},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2015}
}