Global Context-Aware Attention LSTM Networks for 3D Action Recognition

Jun Liu, Gang Wang, Ping Hu, Ling-Yu Duan, Alex C. Kot; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 1647-1656

Abstract


Long Short-Term Memory (LSTM) networks have shown superior performance in 3D human action recognition due to their power in modeling the dynamics and dependencies in sequential data. Since not all joints are informative for action analysis, and the irrelevant joints often introduce considerable noise, we need to pay more attention to the informative ones. However, the original LSTM does not have a strong attention capability. Hence, we propose a new class of LSTM network, the Global Context-Aware Attention LSTM (GCA-LSTM), for 3D action recognition, which is able to selectively focus on the informative joints in the action sequence with the assistance of global contextual information. To achieve a reliable attention representation for the action sequence, we further propose a recurrent attention mechanism for our GCA-LSTM network, in which the attention performance is improved iteratively. Experiments show that our end-to-end network can reliably focus on the most informative joints in each frame of the skeleton sequence. Moreover, our network yields state-of-the-art performance on three challenging datasets for 3D action recognition.
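
The idea of scoring joints against a global context and refining that context over several passes can be illustrated with a small NumPy sketch. This is not the paper's formulation: the scoring function, the softmax over joints, the mean-pooled initial context, and every name here (`refine_global_context`, `w_h`, `w_g`, `v`, `num_iters`) are assumptions chosen for illustration, and the per-joint hidden states are random stand-ins for LSTM outputs.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def refine_global_context(hidden, w_h, w_g, v, num_iters=2):
    """Iteratively refine a global context vector (illustrative only).

    hidden: (T, J, D) per-frame, per-joint hidden states (assumed given).
    w_h, w_g: (D, K) projections; v: (K,) scoring vector (hypothetical).
    Returns the refined context (D,) and the attention weights (T, J).
    """
    T, J, D = hidden.shape
    # Assumed initialization: average all hidden states.
    context = hidden.reshape(-1, D).mean(axis=0)
    weights = np.full((T, J), 1.0 / J)
    for _ in range(num_iters):
        # Score each joint against the current global context.
        scores = np.tanh(hidden @ w_h + context @ w_g) @ v  # (T, J)
        # Normalize over the joints within each frame.
        weights = softmax(scores, axis=1)
        # Re-estimate the context from the attended joints.
        context = (weights[..., None] * hidden).sum(axis=(0, 1)) / T
    return context, weights


rng = np.random.default_rng(0)
hidden = rng.standard_normal((4, 25, 8))  # 4 frames, 25 joints, 8 dims
w_h = rng.standard_normal((8, 16))
w_g = rng.standard_normal((8, 16))
v = rng.standard_normal(16)
context, weights = refine_global_context(hidden, w_h, w_g, v)
print(context.shape, weights.shape)
```

In this sketch each pass uses the current context to reweight the joints and then rebuilds the context from the reweighted joints, which mirrors the iterative improvement the abstract describes only at a schematic level.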

Related Material


[bibtex]
@InProceedings{Liu_2017_CVPR,
author = {Liu, Jun and Wang, Gang and Hu, Ping and Duan, Ling-Yu and Kot, Alex C.},
title = {Global Context-Aware Attention LSTM Networks for 3D Action Recognition},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {July},
year = {2017},
pages = {1647-1656}
}