JOLO-GCN: Mining Joint-Centered Light-Weight Information for Skeleton-Based Action Recognition

Jinmiao Cai, Nianjuan Jiang, Xiaoguang Han, Kui Jia, Jiangbo Lu; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2021, pp. 2735-2744

Abstract


Skeleton-based action recognition has attracted research attentions in recent years. One common drawback in currently popular skeleton-based human action recognition methods is that the sparse skeleton information alone is not sufficient to fully characterize human motion. This limitation makes several existing methods incapable of correctly classifying action categories which exhibit only subtle motion differences. In this paper, we propose a novel framework for employing human pose skeleton and joint-centered light-weight information jointly in a two-stream graph convolutional network, namely, JOLO-GCN. Specifically, we use Joint-aligned optical Flow Patches (JFP) to capture the local subtle motion around each joint as the pivotal joint-centered visual information. Compared to the pure skeleton-based baseline, this hybrid scheme effectively boosts performance, while keeping the computational and memory overheads low. Experiments on the NTU RGB+D, NTU RGB+D 120, and the Kinetics-Skeleton dataset demonstrate clear accuracy improvements attained by the proposed method over the state-of-the-art skeleton-based methods.

Related Material


[pdf]
[bibtex]
@InProceedings{Cai_2021_WACV, author = {Cai, Jinmiao and Jiang, Nianjuan and Han, Xiaoguang and Jia, Kui and Lu, Jiangbo}, title = {JOLO-GCN: Mining Joint-Centered Light-Weight Information for Skeleton-Based Action Recognition}, booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)}, month = {January}, year = {2021}, pages = {2735-2744} }