Prior-aware Dynamic Temporal Modeling Framework for Sequential 3D Hand Pose Estimation

Ren, Pengfei; Wang, Jingyu; Sun, Haifeng; Qi, Qi; Liu, Xingyu; Zhang, Menghao; Zhang, Lei; Wang, Jing; Liao, Jianxin

Pengfei Ren, Jingyu Wang, Haifeng Sun, Qi Qi, Xingyu Liu, Menghao Zhang, Lei Zhang, Jing Wang, Jianxin Liao; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2025, pp. 6476-6487

Abstract

3D hand pose estimation plays a critical role in various human-computer interaction tasks. Single-frame 3D hand pose estimation methods have poor temporal smoothness and are easily affected by self-occlusion, which severely impacts their practical applicability. Traditional joint-based sequential pose estimation methods primarily focus on the human body and struggle to handle the complex hand structure, high degrees of freedom in hand motion, and rapidly changing hand motion trends. To address these challenges, we propose a prior-aware dynamic temporal modeling framework for sequential 3D hand pose estimation. We introduce a flexible memory mechanism to model hand prior information, which alleviates the scale and depth ambiguity in single-frame hand pose estimation. Additionally, we propose a dynamic temporal convolution module that adjusts the receptive field size and feature aggregation weights based on the motion information at each moment, effectively capturing rapid motion trends. By decoupling dynamic temporal modeling at the joint and hand levels, our method captures both subtle short-term variations and long-term motion trends, significantly improving the smoothness and accuracy of hand pose estimation. Experiments on four public datasets demonstrate that our method achieves state-of-the-art results in terms of both hand pose estimation accuracy and temporal smoothness.
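The idea of adjusting the receptive field and the aggregation weights from per-frame motion can be illustrated with a small sketch. This is not the authors' module: the abstract gives no formulas, so the function names (`frame_distance`, `window_radius`, `dynamic_temporal_smooth`), the rule that shrinks the window as motion grows, the exponential weighting, and all default parameters are assumptions chosen only for illustration. A learned temporal convolution would replace these hand-written rules.

```python
import math

def frame_distance(a, b):
    """Mean Euclidean distance between corresponding joints of two frames."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def window_radius(motion, max_radius=3, motion_scale=0.05):
    """Hypothetical rule: the faster the motion, the smaller the temporal window."""
    return max(1, int(max_radius / (1.0 + motion / motion_scale)))

def dynamic_temporal_smooth(seq, max_radius=3, motion_scale=0.05, temperature=0.05):
    """Smooth a pose sequence with a motion-dependent window and weights.

    seq is a list of frames; each frame is a list of (x, y, z) joint tuples.
    Assumed weighting: neighbours whose pose is closer to the centre frame
    contribute more (exponential fall-off controlled by `temperature`).
    """
    n = len(seq)
    # Per-frame motion: displacement from the previous frame (0 for the first frame).
    motion = [0.0] + [frame_distance(seq[t - 1], seq[t]) for t in range(1, n)]
    out = []
    for t in range(n):
        r = window_radius(motion[t], max_radius, motion_scale)
        idx = range(max(0, t - r), min(n, t + r + 1))
        w = [math.exp(-frame_distance(seq[s], seq[t]) / temperature) for s in idx]
        total = sum(w)
        frame = []
        for j in range(len(seq[t])):
            # Weighted average of joint j over the selected neighbouring frames.
            frame.append(tuple(
                sum(wi * seq[s][j][k] for wi, s in zip(w, idx)) / total
                for k in range(3)
            ))
        out.append(frame)
    return out
```

Under these assumptions, slow segments are averaged over a wider window, which suppresses jitter, while fast segments use a narrow window so that rapid motion is not blurred.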

Related Material



@InProceedings{Ren_2025_ICCV,
    author    = {Ren, Pengfei and Wang, Jingyu and Sun, Haifeng and Qi, Qi and Liu, Xingyu and Zhang, Menghao and Zhang, Lei and Wang, Jing and Liao, Jianxin},
    title     = {Prior-aware Dynamic Temporal Modeling Framework for Sequential 3D Hand Pose Estimation},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2025},
    pages     = {6476-6487}
}