EventHands: Real-Time Neural 3D Hand Pose Estimation From an Event Stream

Viktor Rudnev, Vladislav Golyanik, Jiayi Wang, Hans-Peter Seidel, Franziska Mueller, Mohamed Elgharib, Christian Theobalt; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 12385-12395

Abstract


3D hand pose estimation from monocular videos is a long-standing and challenging problem, which is now seeing a strong upturn. In this work, we address it for the first time using a single event camera, i.e., an asynchronous vision sensor reacting to brightness changes. Our EventHands approach has characteristics previously not demonstrated with a single RGB or depth camera, such as high temporal resolution at low data throughputs and real-time performance at 1000 Hz. Due to the different data modality of event cameras compared to classical cameras, existing methods cannot be directly applied to and re-trained for event streams. We thus design a new neural approach which accepts a new event stream representation suitable for learning, which is trained on newly-generated synthetic event streams and can generalise to real data. Experiments show that EventHands outperforms recent monocular methods using a colour (or depth) camera in terms of accuracy and its ability to capture hand motions of unprecedented speed. Our method, the event stream simulator and the dataset are publicly available (see https://gvv.mpi-inf.mpg.de/projects/EventHands/).

Related Material


@InProceedings{Rudnev_2021_ICCV,
    author    = {Rudnev, Viktor and Golyanik, Vladislav and Wang, Jiayi and Seidel, Hans-Peter and Mueller, Franziska and Elgharib, Mohamed and Theobalt, Christian},
    title     = {EventHands: Real-Time Neural 3D Hand Pose Estimation From an Event Stream},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {12385-12395}
}