Deep Geometric Moments Promote Shape Consistency in Text-to-3D Generation

Utkarsh Nath, Rajeev Goel, Eun Som Jeon, Changhoon Kim, Kyle Min, Yezhou Yang, Yingzhen Yang, Pavan Turaga; Proceedings of the Winter Conference on Applications of Computer Vision (WACV), 2025, pp. 4331-4341

Abstract


To address the data scarcity associated with 3D assets, 2D-lifting techniques such as Score Distillation Sampling (SDS) have become a widely adopted practice in text-to-3D generation pipelines. However, the diffusion models used in these techniques are prone to viewpoint bias and thus lead to geometric inconsistencies such as the Janus problem. To counter this, we introduce MT3D, a text-to-3D generative model that leverages a high-fidelity 3D object to overcome viewpoint bias and explicitly infuse geometric understanding into the generation pipeline. First, we employ depth maps derived from a high-quality 3D model as control signals to guarantee that the generated 2D images preserve the fundamental shape and structure, thereby reducing the inherent viewpoint bias. Next, we utilize deep geometric moments to explicitly enforce geometric consistency in the 3D representation. By incorporating geometric details from a 3D asset, MT3D enables the creation of diverse and geometrically consistent objects, thereby improving the quality and usability of our 3D representations. Project page and code: https://moment-3d.github.io/
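
To make the moment-based consistency idea concrete, the following is a minimal PyTorch sketch of a classical geometric-moment consistency term between a rendered view and a reference view (e.g., a render or depth map of the guiding high-fidelity 3D asset). The function names image_moments and moment_consistency_loss are hypothetical, and the paper's deep geometric moments are produced by a learned network rather than this closed-form formulation; this is only an illustration under those assumptions, not the authors' implementation.

import torch
import torch.nn.functional as F

def image_moments(img: torch.Tensor, max_order: int = 2) -> torch.Tensor:
    # Raw geometric moments m_pq = sum_{x,y} x^p y^q I(x, y) for a batch of
    # single-channel images of shape (B, H, W). Illustrative only: the paper's
    # deep geometric moments come from a learned network, not this closed form.
    B, H, W = img.shape
    ys = torch.linspace(0, 1, H, device=img.device)
    xs = torch.linspace(0, 1, W, device=img.device)
    Y, X = torch.meshgrid(ys, xs, indexing="ij")
    moments = []
    for p in range(max_order + 1):
        for q in range(max_order + 1 - p):
            basis = (X ** p) * (Y ** q)              # monomial basis x^p y^q
            moments.append((img * basis).sum(dim=(1, 2)))
    return torch.stack(moments, dim=1)               # (B, number of moments)

def moment_consistency_loss(rendered: torch.Tensor, reference: torch.Tensor) -> torch.Tensor:
    # L2 distance between the moment vectors of a rendered view and a
    # reference view of the guiding 3D asset (hypothetical loss term).
    return F.mse_loss(image_moments(rendered), image_moments(reference))

In a text-to-3D loop, such a term would be added to the SDS objective so that gradients from the moment mismatch push the 3D representation toward the coarse shape of the reference asset while the diffusion guidance supplies appearance and detail.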

Related Material


@InProceedings{Nath_2025_WACV,
    author    = {Nath, Utkarsh and Goel, Rajeev and Jeon, Eun Som and Kim, Changhoon and Min, Kyle and Yang, Yezhou and Yang, Yingzhen and Turaga, Pavan},
    title     = {Deep Geometric Moments Promote Shape Consistency in Text-to-3D Generation},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {4331-4341}
}