Scale Invariant Human Action Detection From Depth Cameras Using Class Templates

Kartik Gupta, Arnav Bhavsar; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2016, pp. 38-45

Abstract


We consider the problem of detecting and localizing a human action from continuous action video from depth cameras. We believe that this problem is more challenging than the problem of traditional action recognition as we do not have the information about the starting and ending frames of an action class. Another challenge which makes the problem difficult, is the latency in detection of actions. In this paper, we introduce a greedy approach to detect the action class, invariant of their temporal scale in the testing sequences using class templates and basic skeleton based feature representation from the depth stream data generated using Microsoft Kinect. We evaluate the proposed method on the standard G3D and UTKinect-Action datasets consisting of five and ten actions, respectively. Our results demonstrate that the proposed approach performs well for action detection and recognition under different temporal scales, and is able to outperform the state of the art methods at low latency.

Related Material


[pdf]
[bibtex]
@InProceedings{Gupta_2016_CVPR_Workshops,
author = {Gupta, Kartik and Bhavsar, Arnav},
title = {Scale Invariant Human Action Detection From Depth Cameras Using Class Templates},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2016}
}