@InProceedings{Gao_2024_CVPR,
    author    = {Gao, Xiangjun and Li, Xiaoyu and Zhang, Chaopeng and Zhang, Qi and Cao, Yanpei and Shan, Ying and Quan, Long},
    title     = {ConTex-Human: Free-View Rendering of Human from a Single Image with Texture-Consistent Synthesis},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {10084-10094}
}
ConTex-Human: Free-View Rendering of Human from a Single Image with Texture-Consistent Synthesis
Abstract
In this work, we propose a method for free-view rendering of a 3D human from a single image. Some existing approaches achieve this either by using generalizable pixel-aligned implicit fields to reconstruct a textured mesh of a human, or by employing a 2D diffusion model as guidance with the Score Distillation Sampling (SDS) method to lift the 2D image into 3D space. However, a generalizable implicit field often results in an over-smooth texture field, while the SDS method tends to produce novel views whose texture is inconsistent with the input image. In this paper, we introduce a texture-consistent back view synthesis method that transfers the reference image content to the back view through depth-guided mutual self-attention. With this method, we achieve high-fidelity and texture-consistent human rendering from a single image. Moreover, to alleviate the color distortion that occurs in the side region, we propose a visibility-aware patch consistency regularization combined with the synthesized back view texture. Experiments conducted on both real and synthetic data demonstrate the effectiveness of our method and show that our approach outperforms previous baseline methods.