Progressive Hypothesis Transformer for 3D Human Mesh Recovery

Huang-Ru Liao, Jen-Chun Lin, Chun-Yi Lee; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, pp. 6323-6332


Recent advancements in Transformer-based human mesh reconstruction (HMR) are commendable. However, these models often lift 2D images directly to 3D vertices without explicit intermediate guidance. In addition, the global attention mechanism tends to spread attention across larger body areas and even unrelated background regions during human mesh estimation, rather than focusing on critical local regions such as human body joints. This tendency leads to inaccurate and unrealistic results for complex activities. To address these challenges, we introduce the Progressive Hypotheses Transformer, which employs 2D and 3D pose predictions to progressively guide our model. Moreover, we propose a mechanism that generates multiple plausible hypotheses for both 2D and 3D poses to mitigate potential inaccuracies arising from intermediate pose estimations. Our model also incorporates inter-intra attention to capture correlations between joints and hypotheses. Experimental results demonstrate that our method surpasses existing image-based approaches on Human3.6M [13] and 3DPW [36] with fewer parameters and relatively lower computational costs.
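
The abstract does not spell out how inter-intra attention is computed, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes features are arranged as a grid `x[h][j]` (hypothesis `h`, joint `j`), that one attention pass runs across the joints within each hypothesis, and that a second pass runs across the hypotheses for each joint. The function names, the absence of learned projections, and the ordering of the two passes are all hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention over a list of vectors.

    Assumption: queries, keys and values are the raw tokens (no learned
    projections), which keeps the sketch short.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return out


def attend_within_hypotheses(x):
    """For each hypothesis, let its joints attend to one another."""
    return [self_attention(joints) for joints in x]


def attend_across_hypotheses(x):
    """For each joint, let its copies in the different hypotheses attend to one another."""
    num_hyp, num_joints = len(x), len(x[0])
    columns = [
        self_attention([x[h][j] for h in range(num_hyp)]) for j in range(num_joints)
    ]
    # Transpose back to the [hypothesis][joint] layout.
    return [[columns[j][h] for j in range(num_joints)] for h in range(num_hyp)]


def joint_hypothesis_attention(x):
    """Hypothetical composition: joints first, then hypotheses."""
    return attend_across_hypotheses(attend_within_hypotheses(x))


random.seed(0)
H, J, D = 3, 4, 8  # hypotheses, joints, feature size (illustrative values)
x = [[[random.gauss(0, 1) for _ in range(D)] for _ in range(J)] for _ in range(H)]
y = joint_hypothesis_attention(x)
print(len(y), len(y[0]), len(y[0][0]))  # 3 4 8
```

Factoring attention along the two axes means each pass compares only J joints or only H hypotheses at a time, rather than all H×J tokens at once. Whether the paper uses this factorization, this ordering, or these axis names is not stated in the abstract.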

Related Material

@InProceedings{Liao_2024_WACV,
    author    = {Liao, Huang-Ru and Lin, Jen-Chun and Lee, Chun-Yi},
    title     = {Progressive Hypothesis Transformer for 3D Human Mesh Recovery},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2024},
    pages     = {6323-6332}
}