3D R Transform on Spatio-temporal Interest Points for Action Recognition

Chunfeng Yuan, Xi Li, Weiming Hu, Haibin Ling, Stephen Maybank; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013, pp. 724-730

Abstract


Spatio-temporal interest points serve as an elementary building block in many modern action recognition algorithms, and most of them exploit the local spatio-temporal volume features using a Bag of Visual Words (BOVW) representation. Such representation, however, ignores potentially valuable information about the global spatio-temporal distribution of interest points. In this paper, we propose a new global feature to capture the detailed geometrical distribution of interest points. It is calculated by using the R transform which is defined as an extended 3D discrete Radon transform, followed by applying a two-directional two-dimensional principal component analysis. Such R feature captures the geometrical information of the interest points and keeps invariant to geometry transformation and robust to noise. In addition, we propose a new fusion strategy to combine the R feature with the BOVW representation for further improving recognition accuracy. We utilize a context-aware fusion method to capture both the pairwise similarities and higher-order contextual interactions of the videos. Experimental results on several publicly available datasets demonstrate the effectiveness of the proposed approach for action recognition.

Related Material


[pdf]
[bibtex]
@InProceedings{Yuan_2013_CVPR,
author = {Yuan, Chunfeng and Li, Xi and Hu, Weiming and Ling, Haibin and Maybank, Stephen},
title = {3D R Transform on Spatio-temporal Interest Points for Action Recognition},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2013}
}