Make-It-3D: High-fidelity 3D Creation from A Single Image with Diffusion Prior

Junshu Tang, Tengfei Wang, Bo Zhang, Ting Zhang, Ran Yi, Lizhuang Ma, Dong Chen; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 22819-22829

Abstract


In this work, we investigate the problem of creating high-fidelity 3D content from only a single image. This is inherently challenging: it essentially involves estimating the underlying 3D geometry while hallucinating unseen textures. To address this challenge, we leverage prior knowledge from a well-trained 2D diffusion model as 3D-aware supervision for 3D creation. Our proposed method, Make-It-3D, employs a two-stage optimization pipeline: the first stage optimizes a neural radiance field with constraints from the reference image and the diffusion prior; the second stage builds textured point clouds from the coarse model and further enhances the textures with the diffusion prior, leveraging the high-quality textures available from the reference image. Extensive experiments show that our method achieves a clear improvement over previous works, displaying faithful reconstruction and impressive visual quality. Our method presents the first attempt to achieve high-quality 3D creation from a single image for general objects, and enables various applications such as text-to-3D creation and texture editing.

Related Material


@InProceedings{Tang_2023_ICCV,
    author    = {Tang, Junshu and Wang, Tengfei and Zhang, Bo and Zhang, Ting and Yi, Ran and Ma, Lizhuang and Chen, Dong},
    title     = {Make-It-3D: High-fidelity 3D Creation from A Single Image with Diffusion Prior},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {22819-22829}
}