The Best of Both Worlds: Combining Data-Independent and Data-Driven Approaches for Action Recognition

Zhenzhong Lan, Shoou-I Yu, Dezhong Yao, Ming Lin, Bhiksha Raj, Alexander Hauptmann; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2016, pp. 123-132

Abstract


Motivated by the success of CNNs in object recognition on images, researchers are striving to develop CNN equivalents for learning video features. However, learning video features globally has proven to be quite a challenge due to the difficulty of getting enough labels, processing large-scale video data, and representing motion information. Therefore, we propose to leverage effective techniques from both data-driven and data-independent approaches to improve action recognition systems. Our contribution is three-fold. First, we explicitly show that local handcrafted features and CNNs share the same convolution-pooling network structure. Second, we propose to use independent subspace analysis (ISA) to learn descriptors for state-of-the-art handcrafted features. Third, we enhance ISA with two new improvements, which make our learned descriptors significantly outperform the handcrafted ones. Experimental results on standard action recognition benchmarks show competitive performance.
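For readers unfamiliar with ISA, the following is a minimal sketch of the standard ISA feature computation (linear filtering, squaring, then pooling within each subspace), which has the same filter-then-pool shape as the convolution-pooling structure mentioned above. It is not the authors' implementation and does not include the two improvements the paper proposes. The function name `isa_responses`, the input sizes, and the random filters are assumptions made purely for illustration; in practice the filters would be learned from data.

```python
import numpy as np

def isa_responses(patches, filters, group_size):
    """Standard ISA forward pass (illustrative sketch, hypothetical helper).

    patches: (n, d) array of flattened input patches.
    filters: (k, d) array of linear filters; consecutive runs of
             `group_size` filters are treated as one subspace.
    Returns an (n, k // group_size) array of subspace responses.
    """
    k = filters.shape[0]
    if k % group_size != 0:
        raise ValueError("number of filters must be divisible by group_size")
    linear = patches @ filters.T              # filtering step: (n, k)
    energy = linear ** 2                      # squared filter responses
    pooled = energy.reshape(len(patches), k // group_size, group_size).sum(axis=2)
    return np.sqrt(pooled)                    # pooling step: L2 norm per subspace

# Random stand-ins for patches and filters (assumption: real filters are learned).
rng = np.random.default_rng(0)
patches = rng.standard_normal((5, 16))
filters = rng.standard_normal((8, 16))
features = isa_responses(patches, filters, group_size=2)
```

Each output value is the L2 norm of the filter responses inside one subspace, so it is non-negative and unchanged if the sign of the input patch is flipped.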

Related Material


[bibtex]
@InProceedings{Lan_2016_CVPR_Workshops,
author = {Lan, Zhenzhong and Yu, Shoou-I and Yao, Dezhong and Lin, Ming and Raj, Bhiksha and Hauptmann, Alexander},
title = {The Best of Both Worlds: Combining Data-Independent and Data-Driven Approaches for Action Recognition},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2016}
}