3D Ego-Pose Estimation via Imitation Learning

Ye Yuan, Kris Kitani; Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 735-750


Ego-pose estimation, i.e., estimating a person's 3D pose with a single wearable camera, has many potential applications in activity monitoring. For these applications, both accurate and physically plausible estimates are desired, with the latter often overlooked by existing work. Traditional computer vision-based approaches using temporal smoothing take into account only the kinematics of the motion, without considering the physics that underlies its dynamics, which leads to pose estimates that are physically invalid. Motivated by this, we propose a novel control-based approach to model human motion with physics simulation and use imitation learning to learn a video-conditioned control policy for ego-pose estimation. Our imitation learning framework allows us to perform domain adaptation to transfer our policy trained on simulation data to real-world data. Our experiments with real egocentric videos show that our method can estimate both accurate and physically plausible 3D ego-pose sequences without observing the camera wearer's body.

Related Material

author = {Yuan, Ye and Kitani, Kris},
title = {3D Ego-Pose Estimation via Imitation Learning},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}