ATT3D: Amortized Text-to-3D Object Synthesis
Abstract
Text-to-3D modelling has seen exciting progress by combining generative text-to-image models with image-to-3D methods like Neural Radiance Fields. DreamFusion recently achieved high-quality results but requires a lengthy, per-prompt optimization to create each 3D object. To address this, we amortize optimization over text prompts by training a single, unified model on many prompts simultaneously, rather than optimizing each prompt separately. This shares computation across the prompt set, so training takes less time than per-prompt optimization. Our framework, Amortized Text-to-3D (ATT3D), enables knowledge sharing between prompts to generalize to unseen setups, and supports smooth interpolations between text prompts for novel assets and simple animations.
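As a rough illustration of the amortization idea described above (a minimal sketch, not the authors' implementation), the PyTorch snippet below trains one text-conditioned mapping network across a whole prompt set. The names `encode_text` and `render_and_score` are hypothetical placeholders standing in for a frozen text encoder, a differentiable NeRF render, and a DreamFusion-style score-distillation loss.

```python
# Minimal sketch of amortized text-to-3D training, under stated assumptions.
import torch
import torch.nn as nn

class MappingNet(nn.Module):
    """Maps a text embedding to modulation parameters for a shared 3D model."""
    def __init__(self, text_dim=512, mod_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(text_dim, 512), nn.ReLU(),
            nn.Linear(512, mod_dim),
        )

    def forward(self, text_emb):
        return self.net(text_emb)

# Placeholders (assumptions, not a real API): in practice these would be a
# frozen text encoder, a NeRF rendered from a sampled camera, and the
# score-distillation loss from a frozen text-to-image diffusion model.
def encode_text(prompt):
    torch.manual_seed(abs(hash(prompt)) % 2**31)  # deterministic toy embedding
    return torch.randn(512)

def render_and_score(modulation, text_emb):
    return (modulation ** 2).mean()  # stand-in for render + SDS loss

prompts = ["a hamburger", "a blue chair", "a stone castle"]
mapper = MappingNet()
opt = torch.optim.Adam(mapper.parameters(), lr=1e-4)

# One unified model is optimized over the whole prompt set at once,
# instead of running a separate, lengthy optimization per prompt.
for step in range(100):
    prompt = prompts[step % len(prompts)]
    emb = encode_text(prompt)
    loss = render_and_score(mapper(emb), emb)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because a single mapping network serves every prompt, blending two prompt embeddings, e.g. `(1 - t) * e1 + t * e2`, yields intermediate outputs; the interpolations and simple animations mentioned in the abstract operate in this spirit.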
Related Material
[pdf]
[supp]
[arXiv]
[bibtex]
@InProceedings{Lorraine_2023_ICCV,
    author    = {Lorraine, Jonathan and Xie, Kevin and Zeng, Xiaohui and Lin, Chen-Hsuan and Takikawa, Towaki and Sharp, Nicholas and Lin, Tsung-Yi and Liu, Ming-Yu and Fidler, Sanja and Lucas, James},
    title     = {ATT3D: Amortized Text-to-3D Object Synthesis},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {17946-17956}
}