Context-Based Interpretable Spatio-Temporal Graph Convolutional Network for Human Motion Forecasting

Edgar Medina, Leyong Loh, Namrata Gurung, Kyung Hun Oh, Niels Heller; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, pp. 3232-3241

Abstract


Human motion prediction remains an open problem that is extremely important for autonomous driving and safety applications. Because of the complex spatio-temporal relations in motion sequences, it is challenging not only to predict movement but also to provide a preliminary interpretation of the joint connections. In this work, we present the Context-based Interpretable Spatio-Temporal Graph Convolutional Network (CIST-GCN), an efficient GCN-based model for 3D human pose forecasting that incorporates dedicated layers which aid interpretability and provide information useful for analyzing motion distributions and body behavior. Our architecture extracts meaningful features from pose sequences, aggregates displacements and accelerations into the model input, and predicts the output displacements. Extensive experiments on the Human3.6M, AMASS, 3DPW, and ExPI datasets demonstrate that CIST-GCN outperforms previous methods in human motion prediction and robustness. Since enhancing interpretability for motion prediction has merit, we showcase experiments towards it and provide a preliminary evaluation of such insights.
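The abstract describes a pipeline that augments a pose sequence with displacements and accelerations before a spatio-temporal GCN predicts output displacements. The sketch below is not the authors' implementation; it is a minimal, generic illustration of that idea (first and second temporal differences stacked as input channels, followed by one ST-GCN-style block), with shapes, layer sizes, and the learnable adjacency chosen purely as assumptions.

```python
# Minimal sketch (assumed design, not the CIST-GCN code): build model inputs from
# positions, displacements, and accelerations, then apply one spatio-temporal
# graph convolution block and predict per-joint output displacements.
import torch
import torch.nn as nn


def build_inputs(poses):
    # poses: (batch, frames, joints, 3) absolute 3D joint positions
    disp = poses[:, 1:] - poses[:, :-1]    # displacements (1st differences)
    accel = disp[:, 1:] - disp[:, :-1]     # accelerations (2nd differences)
    T = accel.shape[1]
    # align the three streams to the shortest length and stack on the channel axis
    return torch.cat([poses[:, -T:], disp[:, -T:], accel], dim=-1)  # (B, T, J, 9)


class STGCNBlock(nn.Module):
    """One spatial graph convolution followed by a temporal convolution."""

    def __init__(self, in_ch, out_ch, num_joints, kernel_t=3):
        super().__init__()
        # learnable adjacency over joints (spatial relations between joints)
        self.adj = nn.Parameter(torch.eye(num_joints))
        self.spatial = nn.Linear(in_ch, out_ch)
        self.temporal = nn.Conv2d(out_ch, out_ch, (kernel_t, 1),
                                  padding=(kernel_t // 2, 0))
        self.act = nn.ReLU()

    def forward(self, x):
        # x: (B, T, J, C); aggregate over joints with the adjacency, then mix channels
        x = torch.einsum("jk,btkc->btjc", torch.softmax(self.adj, dim=-1), x)
        x = self.act(self.spatial(x))
        # temporal convolution expects (B, C, T, J)
        x = x.permute(0, 3, 1, 2)
        x = self.act(self.temporal(x))
        return x.permute(0, 2, 3, 1)


# toy usage with illustrative sizes: 12 observed frames, 22 joints
poses = torch.randn(2, 12, 22, 3)
feats = build_inputs(poses)                      # (2, 10, 22, 9)
block = STGCNBlock(in_ch=9, out_ch=64, num_joints=22)
head = nn.Linear(64, 3)                          # maps features to 3D displacements
pred_disp = head(block(feats))                   # (2, 10, 22, 3)
```

A learnable adjacency is used here because the paper emphasizes interpreting joint connections; inspecting such a matrix is one common way to read off which joints the model relates, though the actual interpretability layers in CIST-GCN may differ.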

Related Material


[bibtex]
@InProceedings{Medina_2024_WACV,
    author    = {Medina, Edgar and Loh, Leyong and Gurung, Namrata and Oh, Kyung Hun and Heller, Niels},
    title     = {Context-Based Interpretable Spatio-Temporal Graph Convolutional Network for Human Motion Forecasting},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2024},
    pages     = {3232-3241}
}