SignPose: Sign Language Animation Through 3D Pose Lifting

Krishna, Shyam; P, Vijay Vignesh; J, Dinesh Babu

Shyam Krishna, Vijay Vignesh P, Dinesh Babu J; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2021, pp. 2640-2649

Abstract

Sign Language Generation (SLG) is a challenging task in computer animation, as it involves capturing intricate hand gestures accurately for several thousand signs in each sign language. Traditional methods require expensive equipment and considerable human involvement. In this paper, we provide a method to automate this process using only plain RGB images to generate sign poses for an avatar, the first of its kind for SLG. Current state-of-the-art models for human 3D pose estimation do not perform satisfactorily in SLG due to the large difference between tasks. The datasets they are trained on contain only tasks like walking and playing sports, which involve significantly different types of motion compared to signing. Synthetic, manually created 3D animations are available for diverse tasks, including sign language performance. Modern 2D pose estimation models that work on real-world images are also robust enough to work accurately on these animations. Inspired by this, we formulate a novel method of leveraging animation data, using an intermediate 2D pose representation, to train an SLG animation model that works on real-world sign language performance videos. To create the dataset for training, we extend an available animated dataset of signs in the Indian Sign Language (ISL) by permuting different hand and body motions. A novel quaternion-based architecture is created to perform the task of lifting the 2D keypoints to 3D. The architecture is simplified to match the requirements of our task as well as to work with our smaller dataset size. We train a model, SignPose, using this architecture on the constructed dataset and demonstrate that it matches or outperforms current models for human pose reconstruction on the Sign Language Generation task. We will release both the dataset and the model to the public to encourage further research in this field.
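To make the quaternion-based lifting step more concrete, the following is a minimal sketch of how per-bone quaternions could be turned into 3D joint positions for an avatar. It is an illustration under stated assumptions, not the SignPose architecture itself: the abstract does not specify the skeleton, the quaternion convention, or whether rotations are global or parent-relative. The sketch assumes (w, x, y, z) ordering, one global rotation per bone, and a hypothetical rest direction along +Y; the names `normalize`, `rotate`, and `chain_positions` are invented for this example.

```python
import numpy as np

def normalize(q):
    """Scale a quaternion (w, x, y, z) to unit length so it encodes a pure rotation."""
    q = np.asarray(q, dtype=float)
    return q / np.linalg.norm(q)

def rotate(q, v):
    """Rotate the 3D vector v by the unit quaternion q = (w, x, y, z)."""
    w, u = q[0], q[1:]
    v = np.asarray(v, dtype=float)
    # Standard identity: v' = v + 2w (u x v) + 2 u x (u x v)
    t = 2.0 * np.cross(u, v)
    return v + w * t + np.cross(u, t)

def chain_positions(quats, bone_lengths, rest_dir=(0.0, 1.0, 0.0)):
    """Place the joints of a simple bone chain starting at the origin.

    Assumption: each quaternion is a global rotation applied to a bone
    that points along rest_dir in the rest pose (hypothetical convention).
    """
    positions = [np.zeros(3)]
    for q, length in zip(quats, bone_lengths):
        bone = rotate(normalize(q), np.asarray(rest_dir, dtype=float) * length)
        positions.append(positions[-1] + bone)
    return np.stack(positions)

# Example: an unrotated bone of length 2 followed by a bone of length 1
# rotated 90 degrees about the Z axis.
identity = (1.0, 0.0, 0.0, 0.0)
quarter_turn_z = (np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4))
joints = chain_positions([identity, quarter_turn_z], [2.0, 1.0])
```

Normalizing a predicted quaternion before use is a common design choice in such models, since a network's raw four-dimensional output is not guaranteed to have unit length.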

Related Material

[bibtex]

@InProceedings{Krishna_2021_ICCV,
    author    = {Krishna, Shyam and P, Vijay Vignesh and J, Dinesh Babu},
    title     = {SignPose: Sign Language Animation Through 3D Pose Lifting},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
    month     = {October},
    year      = {2021},
    pages     = {2640-2649}
}