Recognizing Activities via Bag of Words for Attribute Dynamics

Weixin Li, Qian Yu, Harpreet Sawhney, Nuno Vasconcelos; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013, pp. 2587-2594

Abstract


In this work, we propose a novel video representation for activity recognition that models video dynamics with attributes of activities. A video sequence is decomposed into short-term segments, which are characterized by the dynamics of their attributes. These segments are modeled by a dictionary of attribute dynamics templates, which are implemented by a recently introduced generative model, the binary dynamic system (BDS). We propose methods for learning a dictionary of BDSs from a training corpus, and for quantizing attribute sequences extracted from videos into these BDS codewords. This procedure produces a representation of the video as a histogram of BDS codewords, which is denoted the bag-of-words for attribute dynamics (BoWAD). An extensive experimental evaluation reveals that this representation outperforms other state-of-the-art approaches in temporal structure modeling for complex activity recognition.
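The pipeline described above (segment the attribute sequence, assign each segment to a codeword, accumulate a histogram) can be illustrated with a minimal sketch. This is not the authors' implementation: the BDS codewords and their learning procedure are replaced here by fixed attribute templates compared with a plain Euclidean distance, purely as a stand-in. All function names, the window length `win`, and the step `stride` are hypothetical and chosen for illustration only.

```python
from math import sqrt


def split_segments(attr_seq, win, stride):
    """Cut a sequence of per-frame attribute vectors into short-term segments."""
    return [attr_seq[s:s + win]
            for s in range(0, len(attr_seq) - win + 1, stride)]


def segment_distance(seg, template):
    """Euclidean distance between a segment and a template of the same shape.

    Assumption: a stand-in for the BDS-based comparison used in the paper.
    """
    return sqrt(sum((a - b) ** 2
                    for row_s, row_t in zip(seg, template)
                    for a, b in zip(row_s, row_t)))


def quantize(seg, codebook):
    """Return the index of the closest codeword (ties go to the lowest index)."""
    dists = [segment_distance(seg, t) for t in codebook]
    return dists.index(min(dists))


def bow_histogram(attr_seq, codebook, win, stride):
    """Normalized histogram of codeword assignments over all segments."""
    counts = [0] * len(codebook)
    for seg in split_segments(attr_seq, win, stride):
        counts[quantize(seg, codebook)] += 1
    total = sum(counts)
    if total == 0:
        # Sequence shorter than one window: no segments to count.
        return [0.0] * len(codebook)
    return [c / total for c in counts]


# Toy usage: 8 frames, 2 attributes; attributes switch from "off" to "on".
seq = [[0, 0]] * 4 + [[1, 1]] * 4
codebook = [[[0, 0]] * 4, [[1, 1]] * 4]  # two hypothetical templates
hist = bow_histogram(seq, codebook, win=4, stride=4)
```

In this toy case the two non-overlapping segments each match a different template, so the histogram splits evenly between the two codewords. In the paper's setting, the templates would instead be BDS models learned from a training corpus.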

Related Material


[bibtex]
@InProceedings{Li_2013_CVPR,
author = {Li, Weixin and Yu, Qian and Sawhney, Harpreet and Vasconcelos, Nuno},
title = {Recognizing Activities via Bag of Words for Attribute Dynamics},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2013},
pages = {2587--2594}
}