Sherpa3D: Boosting High-Fidelity Text-to-3D Generation via Coarse 3D Prior

Liu, Fangfu; Wu, Diankun; Wei, Yi; Rao, Yongming; Duan, Yueqi

Fangfu Liu, Diankun Wu, Yi Wei, Yongming Rao, Yueqi Duan; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 20763-20774

Abstract

Recently 3D content creation from text prompts has demonstrated remarkable progress by utilizing 2D and 3D diffusion models. While 3D diffusion models ensure great multi-view consistency their ability to generate high-quality and diverse 3D assets is hindered by the limited 3D data. In contrast 2D diffusion models find a distillation approach that achieves excellent generalization and rich details without any 3D data. However 2D lifting methods suffer from inherent view-agnostic ambiguity thereby leading to serious multi-face Janus issues where text prompts fail to provide sufficient guidance to learn coherent 3D results. Instead of retraining a costly viewpoint-aware model we study how to fully exploit easily accessible coarse 3D knowledge to enhance the prompts and guide 2D lifting optimization for refinement. In this paper we propose Sherpa3D a new text-to-3D framework that achieves high-fidelity generalizability and geometric consistency simultaneously. Specifically we design a pair of guiding strategies derived from the coarse 3D prior generated by the 3D diffusion model: a structural guidance for geometric fidelity and a semantic guidance for 3D coherence. Employing the two types of guidance the 2D diffusion model enriches the 3D content with diversified and high-quality results. Extensive experiments show the superiority of our Sherpa3D over the state-of-the-art text-to-3D methods in terms of quality and 3D consistency.

Related Material

[pdf] [supp] [arXiv]

[bibtex]

@InProceedings{Liu_2024_CVPR, author = {Liu, Fangfu and Wu, Diankun and Wei, Yi and Rao, Yongming and Duan, Yueqi}, title = {Sherpa3D: Boosting High-Fidelity Text-to-3D Generation via Coarse 3D Prior}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2024}, pages = {20763-20774} }