DrawInAir: A Lightweight Gestural Interface Based on Fingertip Regression

Garg, Gaurav; Hegde, Srinidhi; Perla, Ramakrishna; Jain, Varun; Vig, Lovekesh; Hebbalaguppe, Ramya

Gaurav Garg, Srinidhi Hegde, Ramakrishna Perla, Varun Jain, Lovekesh Vig, Ramya Hebbalaguppe; Proceedings of the European Conference on Computer Vision (ECCV) Workshops, 2018

Abstract

Hand gestures form a natural way of interacting with Head-Mounted Devices (HMDs) and smartphones. HMDs such as the Microsoft HoloLens, and smartphones that support the ARCore/ARKit platforms, are expensive and are equipped with powerful processors and sensors, such as multiple cameras and depth and IR sensors, to process hand gestures. To enable mass-market reach via inexpensive Augmented Reality (AR) headsets without built-in depth or IR sensors, we propose DrawInAir, a real-time, in-air gestural framework that works on monocular RGB input. DrawInAir uses the fingertip for writing in the air, analogous to a pen on paper. The major challenge in training egocentric gesture recognition models is obtaining sufficient labeled data for end-to-end learning. We therefore design a cascade of networks: a CNN with a differentiable spatial-to-numerical transform (DSNT) layer for fingertip regression, followed by a Bidirectional Long Short-Term Memory (Bi-LSTM) network for real-time classification of pointing hand gestures. We highlight how a model that is separately trained to regress the fingertip, used in conjunction with a classifier trained on limited classification data, performs better than end-to-end models. We also propose a dataset of 10 egocentric pointing gestures designed for AR applications for testing our model. We show that the framework takes 1.73s to run end-to-end and has a low memory footprint of 14MB while achieving an accuracy of 88.0% on an egocentric video dataset.
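The DSNT step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the heatmap is normalized with a softmax (other normalizations are possible), uses plain Python lists instead of a deep learning framework, and the function names `dsnt` and `to_pixels` are hypothetical. The idea is that a fingertip coordinate is read out as the probability-weighted mean of a normalized pixel grid, which keeps the readout differentiable.

```python
import math

def dsnt(heatmap):
    """Return (x, y) in (-1, 1) as the softmax-weighted mean of pixel centers.

    `heatmap` is a 2D list of raw scores (rows = y, columns = x).
    Softmax normalization is an assumption made for this sketch.
    """
    h, w = len(heatmap), len(heatmap[0])
    peak = max(max(row) for row in heatmap)  # subtract the max for numerical stability
    weights = [[math.exp(v - peak) for v in row] for row in heatmap]
    total = sum(sum(row) for row in weights)
    x = y = 0.0
    for r in range(h):
        for c in range(w):
            p = weights[r][c] / total
            x += p * (2 * c + 1 - w) / w  # column center mapped to (-1, 1)
            y += p * (2 * r + 1 - h) / h  # row center mapped to (-1, 1)
    return x, y

def to_pixels(x, y, width, height):
    """Invert the normalization: map (-1, 1) coordinates back to pixel indices."""
    return ((x + 1) * width - 1) / 2, ((y + 1) * height - 1) / 2

# Illustrative 5x5 score map with a strong response at row 1, column 3.
scores = [[0.0] * 5 for _ in range(5)]
scores[1][3] = 50.0
x, y = dsnt(scores)
col, row = to_pixels(x, y, 5, 5)
```

Because the output is an expectation rather than an argmax, gradients flow through every heatmap cell, so a coordinate loss can train the CNN directly. In a cascade like the one described, the per-frame `(x, y)` pairs would then form the sequence passed to the gesture classifier.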

Related Material

[bibtex]

@InProceedings{Garg_2018_ECCV_Workshops,
author = {Garg, Gaurav and Hegde, Srinidhi and Perla, Ramakrishna and Jain, Varun and Vig, Lovekesh and Hebbalaguppe, Ramya},
title = {DrawInAir: A Lightweight Gestural Interface Based on Fingertip Regression},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV) Workshops},
month = {September},
year = {2018}
}