A Hierarchical Pose-Based Approach to Complex Action Understanding Using Dictionaries of Actionlets and Motion Poselets

Ivan Lillo, Juan Carlos Niebles, Alvaro Soto; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 1981-1990

Abstract


In this paper, we introduce a new hierarchical model for human action recognition that is able to categorize complex actions performed in videos. Our model is also able to perform spatio-temporal annotation of the atomic actions that compose the overall complex action. That is, for each atomic action, the model generates temporal annotations by inferring its starting and ending times, as well as spatial annotations by inferring the human body parts that are involved in it. Our model has three key properties: (i) it can be trained with no spatial supervision, as it is able to automatically discover the relevant body parts from temporal action annotations only; (ii) its jointly learned poselet and actionlet representation encodes the visual variability of actions with good generalization power; (iii) its mechanism for handling noisy body pose estimates makes it robust to common pose estimation errors. We experimentally evaluate the performance of our method on multiple action recognition benchmarks. Our model consistently outperforms baselines and state-of-the-art action recognition methods.
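To make the hierarchy concrete, below is a minimal, hedged sketch of the general idea of assigning per-frame body-part descriptors to motion poselets and then pooling them into actionlet labels over temporal segments. It is not the authors' implementation: the dictionaries, feature dimensions, segment length, and nearest-atom assignment rule are all hypothetical placeholders chosen only for illustration.

```python
# Illustrative sketch only; NOT the method from the paper. All dictionaries,
# dimensions, and assignment rules below are hypothetical placeholders.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "learned" dictionaries, filled with random atoms for the demo.
POSELET_DICT = rng.normal(size=(20, 30))    # 20 motion poselets over 30-D part descriptors
ACTIONLET_DICT = rng.normal(size=(8, 20))   # 8 actionlets over poselet-histogram space


def assign_poselets(part_features):
    """Assign each per-frame body-part descriptor to its nearest motion poselet."""
    # part_features: (num_frames, num_parts, 30)
    dists = np.linalg.norm(part_features[..., None, :] - POSELET_DICT, axis=-1)
    return dists.argmin(axis=-1)             # (num_frames, num_parts) poselet ids


def assign_actionlets(poselet_ids, segment_len=10):
    """Pool poselet histograms over temporal segments and pick the closest actionlet."""
    labels = []
    for start in range(0, poselet_ids.shape[0], segment_len):
        seg = poselet_ids[start:start + segment_len]
        hist = np.bincount(seg.ravel(), minlength=POSELET_DICT.shape[0]).astype(float)
        hist /= hist.sum() or 1.0
        labels.append(int(np.linalg.norm(ACTIONLET_DICT - hist, axis=1).argmin()))
    return labels                             # one actionlet id per temporal segment


# Toy usage: 50 frames, 4 body-part groups, 30-D pose/motion descriptors per part.
features = rng.normal(size=(50, 4, 30))
poselets = assign_poselets(features)
print(assign_actionlets(poselets))
```

In the paper these two layers are learned jointly with the complex-action classifier rather than assigned by hard nearest-neighbor lookups as in this toy sketch; the sketch only conveys the poselet-to-actionlet layering described in the abstract.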

Related Material


[bibtex]
@InProceedings{Lillo_2016_CVPR,
author = {Lillo, Ivan and Niebles, Juan Carlos and Soto, Alvaro},
title = {A Hierarchical Pose-Based Approach to Complex Action Understanding Using Dictionaries of Actionlets and Motion Poselets},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2016}
}