Mining Motion Atoms and Phrases for Complex Action Recognition

Limin Wang, Yu Qiao, Xiaoou Tang; Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2013, pp. 2680-2687

Abstract


This paper proposes motion atoms and phrases as mid-level temporal "parts" for representing and classifying complex actions. A motion atom is defined as an atomic part of an action and captures the motion information of an action video at a short temporal scale. A motion phrase is a temporal composite of multiple motion atoms with an AND/OR structure, which further enhances the discriminative ability of motion atoms by incorporating temporal constraints at a longer scale. Specifically, given a set of weakly labeled action videos, we first design a discriminative clustering method to automatically discover a set of representative motion atoms. Then, based on these motion atoms, we mine effective motion phrases with high discriminative and representative power. We introduce a bottom-up phrase construction algorithm and a greedy selection method for this mining task. We examine the classification performance of the motion atom and phrase based representation on two complex action datasets: Olympic Sports and UCF50. Experimental results show that our method achieves superior performance over recently published methods on both datasets.
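
To make the AND/OR structure concrete, the following is a minimal sketch of how a phrase might be scored on one video. It is an illustration under stated assumptions, not the implementation from the paper: the abstract does not specify the scoring rule, so the sketch assumes that an OR over atom units takes the maximum response and an AND over OR units takes the minimum. The `scores` layout (one list of per-segment detector responses per atom), the `(atom, center, half_width)` unit format, and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical layout: an atom unit is (atom_index, temporal_center, half_width).
AtomUnit = Tuple[int, int, int]
OrUnit = List[AtomUnit]   # alternatives; any one of them may fire
Phrase = List[OrUnit]     # all OR units must fire (AND)


def atom_unit_response(scores: Sequence[Sequence[float]], unit: AtomUnit) -> float:
    """Best response of one atom inside its temporal window."""
    atom, center, half_width = unit
    track = scores[atom]
    lo = max(0, center - half_width)
    hi = min(len(track), center + half_width + 1)
    if lo >= hi:
        # The window lies entirely outside the video.
        return float("-inf")
    return max(track[lo:hi])


def phrase_response(scores: Sequence[Sequence[float]], phrase: Phrase) -> float:
    """Assumed rule: OR = max over alternatives, AND = min over OR units."""
    return min(
        max(atom_unit_response(scores, unit) for unit in or_unit)
        for or_unit in phrase
    )


if __name__ == "__main__":
    # Three atoms scored on four temporal segments (made-up numbers).
    scores = [
        [0.1, 0.9, 0.2, 0.0],
        [0.0, 0.1, 0.3, 0.8],
        [0.5, 0.4, 0.1, 0.2],
    ]
    # (atom 0 near segment 1 OR atom 2 near segment 0) AND (atom 1 near segment 3)
    phrase = [[(0, 1, 0), (2, 0, 0)], [(1, 3, 1)]]
    print(phrase_response(scores, phrase))
```

Under this assumed rule, a phrase responds strongly only when every temporal constraint is satisfied by at least one of its alternative atoms, which is one way a longer-scale composite can be more selective than a single atom.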

Related Material


[bibtex]
@InProceedings{Wang_2013_ICCV,
author = {Wang, Limin and Qiao, Yu and Tang, Xiaoou},
title = {Mining Motion Atoms and Phrases for Complex Action Recognition},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
month = {December},
year = {2013}
}