Estimating Human Pose with Flowing Puppets

Silvia Zuffi, Javier Romero, Cordelia Schmid, Michael J. Black; Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2013, pp. 3312-3319

Abstract

We address the problem of upper-body human pose estimation in uncontrolled monocular video sequences, without manual initialization. Most current methods focus on isolated video frames and often fail to correctly localize arms and hands. Inferring pose over a video sequence is advantageous because poses of people in adjacent frames vary smoothly due to the nature of human and camera motion. To exploit this, previous methods have combined prior knowledge about distinctive actions, or generic temporal priors, with static image likelihoods to track people in motion. Here we take a different approach based on a simple observation: Information about how a person moves from frame to frame is present in the optical flow field. We develop an approach for tracking articulated motions that "links" articulated shape models of people in adjacent frames through the dense optical flow. Key to this approach is a 2D shape model of the body that we use to compute how the body moves over time. The resulting "flowing puppets" provide a way of integrating image evidence across frames to improve pose inference. We apply our method to a challenging dataset of TV video sequences and show state-of-the-art performance.

Related Material

[pdf]
[bibtex]
@InProceedings{Zuffi_2013_ICCV,
  author    = {Zuffi, Silvia and Romero, Javier and Schmid, Cordelia and Black, Michael J.},
  title     = {Estimating Human Pose with Flowing Puppets},
  booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
  month     = {December},
  year      = {2013},
  pages     = {3312-3319}
}