Ground-Based Activity Recognition at Distance and Behind Wall

Tao Wang, Riad Hammoud, Zhigang Zhu; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2014, pp. 231-236

Abstract


Long-range activity recognition is a challenging research problem in surveillance settings where sensors cannot be placed close to targets. Even a simple activity can be confused with other activities, or missed entirely, if detection in one of the sensor modalities is uncertain or unavailable. In addition, training for some real-life activities is not feasible, because it is hard to collect sufficient, accurately labeled data for the variety of free-living activities. In this paper, we use an unsupervised learning algorithm, the Dirichlet process Gaussian mixture model (DPGMM), to construct a model that determines the number of classes automatically. To further represent a set of features as one event, and to communicate between the audio and video modalities, we use the DPGMM as a base and enhance it with additional aggregation, multimodal association, and transition. This new model is called the aggregation coupled Dirichlet process Gaussian mixture model (AC-DPGMM). We present experiments with activities that cannot be distinguished using visual features alone. With the addition of audio information, we can also recognize some activities that are invisible in video, such as speaking behind a wall. We compared our model with a generative clustering algorithm and the original DPGMM, and showed improvements in accuracy of 23.6% and 18.8%, respectively, when evaluated against manually labeled data.
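To illustrate what "determining the number of classes automatically" can look like in practice, the sketch below uses DP-means, a hard-assignment simplification that arises as a small-variance limit of a Dirichlet process Gaussian mixture. It is not the AC-DPGMM described above and includes no aggregation, multimodal association, or transition. The function name `dp_means`, the penalty `lam`, and the toy 2-D feature values are all assumptions made for illustration.

```python
def dp_means(points, lam, max_iter=100):
    """Cluster points without fixing the number of clusters in advance.

    A point whose squared distance to every existing center exceeds
    `lam` (a hypothetical, user-chosen penalty) opens a new cluster.
    """
    dim = len(points[0])
    n = len(points)
    # Start with a single cluster located at the global mean.
    centers = [tuple(sum(p[d] for p in points) / n for d in range(dim))]
    labels = [0] * n

    for _ in range(max_iter):
        changed = False
        # Assignment step: nearest center, or a new cluster if too far.
        for i, p in enumerate(points):
            d2 = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            k = min(range(len(centers)), key=d2.__getitem__)
            if d2[k] > lam:
                centers.append(tuple(p))
                k = len(centers) - 1
            if labels[i] != k:
                labels[i] = k
                changed = True

        # Update step: recompute means and drop clusters left empty.
        members = {}
        for i, k in enumerate(labels):
            members.setdefault(k, []).append(points[i])
        kept = sorted(members)
        remap = {old: new for new, old in enumerate(kept)}
        labels = [remap[k] for k in labels]
        centers = [
            tuple(sum(p[d] for p in members[old]) / len(members[old])
                  for d in range(dim))
            for old in kept
        ]
        if not changed:
            break
    return labels, centers


# Toy 2-D features (made-up values) forming three well-separated groups.
features = [
    (0.0, 0.1), (0.2, -0.1), (-0.1, 0.0),
    (10.0, 10.1), (10.2, 9.9), (9.9, 10.0),
    (0.1, 10.0), (-0.2, 9.8), (0.0, 10.2),
]
labels, centers = dp_means(features, lam=9.0)
print(len(centers), labels)
```

The number of clusters is never passed in; it follows from the data and from `lam`. A larger `lam` merges nearby groups, while a smaller one splits them, which loosely mirrors the role of the concentration parameter in a full Dirichlet process model.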

Related Material


[bibtex]
@InProceedings{Wang_2014_CVPR_Workshops,
author = {Wang, Tao and Hammoud, Riad and Zhu, Zhigang},
title = {Ground-Based Activity Recognition at Distance and Behind Wall},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2014}
}