Cross-Stream Selective Networks for Action Recognition

Bowen Pan, Jiankai Sun, Wuwei Lin, Limin Wang, Weiyao Lin; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2019


Combining multiple information streams has shown obvious improvements in video action recognition. Most existing works handle each stream independently or perform a simple combination of temporally simultaneous samples across streams, which fails to make full use of the complementary properties of the streams because it neglects the temporal pattern gaps among them. In this paper, we propose a cross-stream selective network (CSN) to properly integrate and evaluate information from multiple streams. The proposed CSN first introduces a local selective-sampling module (LSM), which finds asynchronous correspondences among streams and constructs highly correlated sample groups across multiple information streams. The LSM can effectively deal with the temporal misalignment among different streams, leading to better integration of cross-stream information. We further introduce a global adaptive-weighting module (GAM). It adaptively evaluates the importance weight of each cross-stream sample group and selects the temporally more important ones for action recognition. With the integration of cross-stream information, our GAM can obtain more reasonable importance weights than existing single-stream weighting schemes. Extensive experiments on the UCF101 and HMDB51 benchmark datasets demonstrate the effectiveness of our approach over previous state-of-the-art methods.
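The two ideas in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify how correspondences are found or how weights are computed, so the sketch assumes cosine similarity within a small temporal window as a stand-in for the LSM, and a softmax over the same similarity scores as a stand-in for the learned GAM. All function names and the `window` parameter are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_groups(stream_a, stream_b, window=1):
    """Assumed stand-in for local selective sampling.

    For each time step t in stream_a, pick the time step s in stream_b
    within [t - window, t + window] whose feature is most similar.
    The pair (t, s) need not be simultaneous, i.e. s may differ from t.
    """
    groups = []
    for t, feat_a in enumerate(stream_a):
        lo = max(0, t - window)
        hi = min(len(stream_b), t + window + 1)
        best = max(range(lo, hi), key=lambda s: cosine(feat_a, stream_b[s]))
        groups.append((t, best))
    return groups


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weight_groups(stream_a, stream_b, groups):
    """Assumed stand-in for global adaptive weighting.

    Scores each cross-stream group by its similarity and normalizes the
    scores into importance weights. A learned scoring network would
    replace the cosine score in a real model.
    """
    scores = [cosine(stream_a[t], stream_b[s]) for t, s in groups]
    return softmax(scores)


# Toy per-frame features for two streams (e.g. appearance and motion).
stream_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
stream_b = [[0.0, 1.0], [1.0, 1.0], [1.0, 0.0]]

groups = select_groups(stream_a, stream_b, window=1)
weights = weight_groups(stream_a, stream_b, groups)
```

In this toy input, no sample in `stream_a` is paired with the simultaneous sample in `stream_b`, which is the kind of asynchronous correspondence the abstract refers to; groups with higher cross-stream similarity receive larger weights.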

Related Material

author = {Pan, Bowen and Sun, Jiankai and Lin, Wuwei and Wang, Limin and Lin, Weiyao},
title = {Cross-Stream Selective Networks for Action Recognition},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2019}