-
[pdf]
[supp]
[arXiv]
[bibtex]@InProceedings{Wan_2024_CVPR, author = {Wan, Ziyu and Paschalidou, Despoina and Huang, Ian and Liu, Hongyu and Shen, Bokui and Xiang, Xiaoyu and Liao, Jing and Guibas, Leonidas}, title = {CAD: Photorealistic 3D Generation via Adversarial Distillation}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2024}, pages = {10194-10207} }
CAD: Photorealistic 3D Generation via Adversarial Distillation
Abstract
The increased demand for 3D data in AR/VR robotics and gaming applications gave rise to powerful generative pipelines capable of synthesizing high-quality 3D objects. Most of these models rely on the Score Distillation Sampling (SDS) algorithm to optimize a 3D representation such that the rendered image maintains a high likelihood as evaluated by a pre-trained diffusion model. However this distillation process involves finding a correct mode in the high-dimensional and large-variance distribution produced by the diffusion model. This task is challenging and often leads to issues such as over-saturation over-smoothing and Janus-like artifacts in the 3D generation. In this paper we propose a novel learning paradigm for 3D synthesis that utilizes pre-trained diffusion models. Instead of focusing on mode-seeking our method directly models the distribution discrepancy between multi-view renderings and diffusion priors in an adversarial manner which unlocks the generation of high-fidelity and photorealistic 3D content conditioned on a single image and prompt. Moreover by harnessing the latent space of GANs and expressive diffusion model priors our method enables a wide variety of 3D applications including single-view reconstruction high diversity generation and continuous 3D interpolation in open domain. Our experiments demonstrate the superiority of our pipeline compared to previous works in terms of generation quality and diversity.
Related Material