PIDiffu: Pixel-Aligned Diffusion Model for High-Fidelity Clothed Human Reconstruction

Jungeun Lee, Sanghun Kim, Hansol Lee, Tserendorj Adiya, Hwasup Lim; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, pp. 5172-5181

Abstract


This paper presents the Pixel-aligned Diffusion Model (PIDiffu), a new framework for reconstructing high-fidelity clothed 3D human models from a single image. While existing PIFu variants have made significant advances through increasingly sophisticated 2D and 3D feature extraction, they still suffer from floating artifacts and duplicated body parts because they rely on point-wise occupancy field estimation. PIDiffu instead employs a diffusion-based strategy that estimates occupancy line-wise along the ray direction, conditioned on pixel-aligned features with guided attention. This approach improves both the local detail and the structural accuracy of the reconstructed body shape and is robust to unfamiliar and complex image features. Moreover, PIDiffu can be easily integrated with existing PIFu-based methods to leverage their advantages. The paper demonstrates that PIDiffu outperforms state-of-the-art methods that do not rely on parametric 3D body models; in particular, it excels at handling 'in-the-wild' images, such as those with complex clothing patterns unseen in the training data.
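The core idea of line-wise diffusion-based estimation can be illustrated with a toy reverse-diffusion loop over one ray's occupancy values. This is a minimal sketch, not the paper's method: the function name `sample_occupancy_line` is hypothetical, and the conditioned denoising network is replaced by a hand-rolled stand-in that pulls toward a fixed step function modulated by the (placeholder) pixel-aligned feature.

```python
import numpy as np

def sample_occupancy_line(feature, num_steps=50, num_depth=64, seed=0):
    """Toy DDPM-style reverse process for one ray's occupancy line.

    `feature` stands in for the pixel-aligned feature that would condition
    the denoiser in PIDiffu; the 'denoiser' below is a toy surrogate,
    not a trained network.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(num_depth)          # start from pure noise
    betas = np.linspace(1e-4, 0.02, num_steps)  # linear noise schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    # Toy "clean" signal the surrogate denoiser pulls toward: a step
    # function along depth (outside -> inside the surface), scaled by a
    # sigmoid of the conditioning feature's mean.
    target = (np.arange(num_depth) > num_depth // 2).astype(float)
    target *= 1.0 / (1.0 + np.exp(-feature.mean()))

    for t in reversed(range(num_steps)):
        # Noise prediction if `target` were the clean signal (stand-in
        # for the feature-conditioned denoising network).
        eps_hat = (x - np.sqrt(alpha_bars[t]) * target) / np.sqrt(1.0 - alpha_bars[t])
        # Standard DDPM posterior-mean update.
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:
            x += np.sqrt(betas[t]) * rng.standard_normal(num_depth)
    return x  # denoised occupancy values along the ray
```

Because the whole line of occupancy values is denoised jointly rather than queried point by point, the samples along a ray stay mutually consistent, which is the property the paper credits for reducing floating artifacts and duplicated body parts.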

Related Material


[bibtex]
@InProceedings{Lee_2024_WACV,
    author    = {Lee, Jungeun and Kim, Sanghun and Lee, Hansol and Adiya, Tserendorj and Lim, Hwasup},
    title     = {PIDiffu: Pixel-Aligned Diffusion Model for High-Fidelity Clothed Human Reconstruction},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2024},
    pages     = {5172-5181}
}