Deformable Pose Traversal Convolution for 3D Action and Gesture Recognition

Junwu Weng, Mengyuan Liu, Xudong Jiang, Junsong Yuan; The European Conference on Computer Vision (ECCV), 2018, pp. 136-152


The representation of 3D pose plays a critical role in 3D body action and hand gesture recognition. Rather than directly representing the 3D pose by its joint locations, in this paper we propose Deformable Pose Traversal Convolution, which applies one-dimensional convolution along a traversal of the 3D pose to represent it. Instead of fixing the receptive field when performing traversal convolution, it optimizes the convolutional kernel for each joint by considering contextual joints with various weights. This deformable convolution can better utilize contextual joints for action and gesture recognition and is more robust to noisy joints. Moreover, by feeding the learned pose feature to an LSTM, we can perform end-to-end training that jointly optimizes 3D pose representation and temporal sequence recognition. Experiments on three benchmark datasets validate the competitive performance of our proposed method, as well as its efficiency and robustness to noisy poses.
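The idea of a one-dimensional convolution with a deformable receptive field over a joint traversal can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the traversal order, the toy pose, the function names `sample_linear` and `deformable_conv1d`, the linear interpolation at fractional positions, and the zero padding are all assumptions made for illustration. The offsets are passed in by hand here, whereas a trained model would presumably predict them from the data.

```python
import numpy as np

def sample_linear(x, pos):
    """Linearly interpolate rows of x (T, C) at a fractional position.

    Positions outside [0, T-1] contribute zeros (zero padding, an assumption).
    """
    T = x.shape[0]
    lo = int(np.floor(pos))
    frac = pos - lo
    out = np.zeros(x.shape[1])
    if 0 <= lo < T:
        out += (1.0 - frac) * x[lo]
    if 0 <= lo + 1 < T:
        out += frac * x[lo + 1]
    return out

def deformable_conv1d(x, weight, offsets):
    """1D convolution whose sampling positions are shifted per step and per tap.

    x:       (T, C_in)  joint features in traversal order
    weight:  (K, C_in, C_out) kernel with K taps
    offsets: (T, K) fractional shifts added to the regular sampling grid
    returns: (T, C_out)
    """
    T = x.shape[0]
    K, _, c_out = weight.shape
    y = np.zeros((T, c_out))
    for t in range(T):
        for k in range(K):
            # Regular grid position plus a per-tap offset.
            pos = t + (k - K // 2) + offsets[t, k]
            y[t] += sample_linear(x, pos) @ weight[k]
    return y

# Toy pose with 5 joints and a hypothetical traversal that revisits joints.
pose = np.arange(15, dtype=float).reshape(5, 3)
traversal = [0, 1, 2, 1, 0, 3, 4]
x = pose[traversal]                      # (7, 3) sequence along the traversal

weight = np.ones((3, 3, 2))              # K=3 taps, 3 input and 2 output channels
zero_offsets = np.zeros((7, 3))
y_fixed = deformable_conv1d(x, weight, zero_offsets)

# Shift the centre tap at step 1 halfway towards the next joint in the traversal.
offsets = zero_offsets.copy()
offsets[1, 1] = 0.5
y_deformed = deformable_conv1d(x, weight, offsets)
```

With all offsets at zero the function reduces to an ordinary zero-padded 1D convolution, so the only change a non-zero offset makes is which contextual joints each tap reads. The resulting per-frame features could then be fed to a recurrent model such as an LSTM for sequence recognition.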

Related Material

author = {Weng, Junwu and Liu, Mengyuan and Jiang, Xudong and Yuan, Junsong},
title = {Deformable Pose Traversal Convolution for 3D Action and Gesture Recognition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}