2D/3D Pose Estimation and Action Recognition Using Multitask Deep Learning

Diogo C. Luvizon, David Picard, Hedi Tabia; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 5137-5146

Abstract


Action recognition and human pose estimation are closely related, but the two problems are generally handled as distinct tasks in the literature. In this work, we propose a multitask framework for joint 2D and 3D pose estimation from still images and human action recognition from video sequences. We show that a single architecture can solve the two problems efficiently and still achieve state-of-the-art results. Additionally, we demonstrate that end-to-end optimization leads to significantly higher accuracy than separate learning. The proposed architecture can be trained seamlessly with data from different categories simultaneously. The reported results on four datasets (MPII, Human3.6M, Penn Action and NTU) demonstrate the effectiveness of our method on the targeted tasks.

Related Material


[bibtex]
@InProceedings{Luvizon_2018_CVPR,
author = {Luvizon, Diogo C. and Picard, David and Tabia, Hedi},
title = {2D/3D Pose Estimation and Action Recognition Using Multitask Deep Learning},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2018}
}