Single-Frame Indexing for 3D Hand Pose Estimation

Cassandra Carley, Carlo Tomasi; Proceedings of the IEEE International Conference on Computer Vision (ICCV) Workshops, 2015, pp. 101-109


Hand pose estimation from 3D sensor data matches a point cloud to a hand model, and has broad applications from gestural interfaces to scene understanding. We propose a novel scheme to index into a database of precomputed hand poses to initialize the match. Our index describes 2D hand silhouettes, which can be computed from either depth maps or standard video, in the form of simple yet expressive signatures. We compare signatures to each other through a new variant of the Earth Mover's Distance that makes small distances in feature space correlate highly with those in pose space. We present a new technique that uses a depth sensor and a sensor glove to create databases of real images and ground-truth poses for both training and testing. We report quantitative results showing state-of-the-art accuracy and speed for both gesture classification and joint-pose regression, even when our 2D single-frame method is compared with methods that employ RGB-D features or multi-sensor inputs.
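
As a rough illustration of the indexing idea described above, the following sketch retrieves the database poses whose signatures are closest to a query signature. It is not the authors' method: it uses the standard one-dimensional Earth Mover's Distance between histograms rather than the new variant proposed in the paper, and the signature layout, the function names emd_1d and nearest_poses, and the toy pose labels are all assumptions made for this example.

```python
from itertools import accumulate


def emd_1d(p, q):
    """Standard 1D Earth Mover's Distance between two histograms.

    Both histograms are assumed to share the same unit-spaced bins.
    Each is normalized to unit mass, so the distance is the sum of
    absolute differences between the two cumulative distributions.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total_p, total_q = sum(p), sum(q)
    diffs = [a / total_p - b / total_q for a, b in zip(p, q)]
    return sum(abs(c) for c in accumulate(diffs))


def nearest_poses(query_sig, db_sigs, db_poses, k=1):
    """Return the k (distance, pose) pairs closest to the query signature.

    A brute-force scan over a hypothetical database; a real index would
    likely use a faster search structure.
    """
    scored = sorted(
        ((emd_1d(query_sig, sig), pose) for sig, pose in zip(db_sigs, db_poses)),
        key=lambda pair: pair[0],
    )
    return scored[:k]


# Toy database: three made-up signatures with made-up pose labels.
db_sigs = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
db_poses = ["fist", "point", "open"]
best = nearest_poses([0.0, 0.9, 0.1], db_sigs, db_poses, k=1)
```

In this toy setup the retrieved pose would serve only as an initialization for the subsequent match between the point cloud and the hand model.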

Related Material

author = {Carley, Cassandra and Tomasi, Carlo},
title = {Single-Frame Indexing for 3D Hand Pose Estimation},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV) Workshops},
month = {December},
year = {2015}