PSHuman: Photorealistic Single-image 3D Human Reconstruction using Cross-Scale Multiview Diffusion and Explicit Remeshing

Li, Peng; Zheng, Wangguandong; Liu, Yuan; Yu, Tao; Li, Yangguang; Qi, Xingqun; Chi, Xiaowei; Xia, Siyu; Cao, Yan-Pei; Xue, Wei; Luo, Wenhan; Guo, Yike

Peng Li, Wangguandong Zheng, Yuan Liu, Tao Yu, Yangguang Li, Xingqun Qi, Xiaowei Chi, Siyu Xia, Yan-Pei Cao, Wei Xue, Wenhan Luo, Yike Guo; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025, pp. 16008-16018

Abstract

Photorealistic 3D human modeling is essential for various applications and has seen tremendous progress. However, existing methods for monocular full-body reconstruction, typically relying on front and/or predicted back view, still struggle with satisfactory performance due to the ill-posed nature of the problem and sophisticated self-occlusions. In this paper, we propose PSHuman, a novel framework that explicitly reconstructs human meshes utilizing priors from the multiview diffusion model. It is found that directly applying multiview diffusion on single-view human images leads to severe geometric distortions, especially on generated faces. To address it, we propose a cross-scale diffusion that models the joint probability distribution of global full-body shape and local facial characteristics, enabling identity-preserved novel-view generation without geometric distortion. Moreover, to enhance cross-view body shape consistency of varied human poses, we condition the generative model on parametric models (SMPL-X), which provide body priors and prevent unnatural views inconsistent with human anatomy. Leveraging the generated multiview normal and color images, we present SMPLX-initialized explicit human carving to recover realistic textured human meshes efficiently. Extensive experiments on CAPE and THuman2.1 demonstrate PSHuman's superiority in geometry details, texture fidelity, and generalization capability.

Related Material

[pdf] [supp] [arXiv]

[bibtex]

@InProceedings{Li_2025_CVPR, author = {Li, Peng and Zheng, Wangguandong and Liu, Yuan and Yu, Tao and Li, Yangguang and Qi, Xingqun and Chi, Xiaowei and Xia, Siyu and Cao, Yan-Pei and Xue, Wei and Luo, Wenhan and Guo, Yike}, title = {PSHuman: Photorealistic Single-image 3D Human Reconstruction using Cross-Scale Multiview Diffusion and Explicit Remeshing}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2025}, pages = {16008-16018} }