REVIVE3D
Refinement via Encoded Voluminous Inflated Prior for Volume Enhancement

A main figure illustrating the key concept of our work.

Abstract

Recent generative models have shown strong performance in generating diverse 3D assets from 2D images, a fundamental research topic in computer vision and graphics. However, these models still struggle to generate voluminous 3D assets when the input is a flat image that provides limited 3D cues. We introduce REVIVE 3D, a two-stage, plug-and-play pipeline for generating voluminous 3D assets from flat images. In Stage 1, we construct an Inflated Prior by inflating the foreground silhouette to recover global volume and superimposing part-aware details to capture local structure. In Stage 2, 3D Latent Refinement injects Gaussian noise into the Inflated Prior's latent and then denoises it, guided by the prior's geometric cues and the backbone's pretrained 3D knowledge. When the process is instead initialized with the encoded latent of a source mesh, the same framework supports 3D editing conditioned on an edited image. To quantify volume and surface flatness, we propose two metrics, Compactness and Normal Anisotropy, and validate them through a user study showing that they align with human perception of volume and quality. Extensive qualitative and quantitative evaluations show that REVIVE 3D achieves state-of-the-art performance on a challenging flat-image dataset.

Pipeline

Overview of our method. Stage 1 generates the Inflated Prior: we create a Base 3D from the Silhouette Mask and a Detail 3D from the Segmentation Masks, then combine them by superimposition. Stage 2 refines the Inflated Prior by encoding the mesh into a latent, injecting noise, denoising the latent conditioned on the input image, and decoding the result into the Refined 3D mesh.
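As a rough illustration of the Stage 1 idea, the sketch below inflates a binary silhouette into a rounded thickness map and lays a part-level detail map on top of it. It is not the method's actual implementation: the square-root-of-distance profile, the additive combination, and the names `inflate_silhouette` and `superimpose` are all assumptions made for this example, and it works on 2D height fields rather than meshes.

```python
import numpy as np


def inflate_silhouette(mask: np.ndarray) -> np.ndarray:
    """Return a per-pixel thickness map for a binary silhouette mask.

    Assumption (not from the source): thickness is the square root of the
    distance to the nearest background pixel, which gives a rounded,
    pillow-like profile that is thickest far from the outline.
    """
    mask = mask.astype(bool)
    fg = np.argwhere(mask)    # (N, 2) foreground pixel coordinates
    bg = np.argwhere(~mask)   # (M, 2) background pixel coordinates
    height = np.zeros(mask.shape, dtype=float)
    if len(fg) == 0 or len(bg) == 0:
        return height
    # Brute-force Euclidean distance transform; adequate for small masks.
    diff = fg[:, None, :] - bg[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)
    height[mask] = np.sqrt(dist)
    return height


def superimpose(base: np.ndarray, detail: np.ndarray) -> np.ndarray:
    """Combine base and detail thickness maps.

    Assumption: the detail layer is simply added on top of the base layer.
    """
    return base + detail


# Toy example: a 5x5 silhouette with a 3x3 "part" in its center.
silhouette = np.zeros((7, 7), dtype=bool)
silhouette[1:6, 1:6] = True
part = np.zeros((7, 7), dtype=bool)
part[2:5, 2:5] = True

base_3d = inflate_silhouette(silhouette)
detail_3d = inflate_silhouette(part)
inflated_prior = superimpose(base_3d, detail_3d)
```

In this toy setup the base layer supplies the global volume and the part layer adds local relief only where a segmentation mask is present, mirroring the Base 3D and Detail 3D roles described in the caption.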

Methodology Pipeline Diagram
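The Stage 2 loop (encode, inject noise, denoise, decode) can be sketched on a plain array standing in for the latent. This is a minimal sketch under stated assumptions, not the pipeline's real code: it assumes a rectified-flow style linear blend between the latent and Gaussian noise, a fixed noise strength `t_start`, and Euler integration. The callable `denoise_step` is a hypothetical stand-in for the image-conditioned backbone, and the encoder and decoder are omitted.

```python
import numpy as np


def inject_noise(z0: np.ndarray, t: float, rng: np.random.Generator) -> np.ndarray:
    """Blend a clean latent with Gaussian noise at strength t in [0, 1].

    Assumption: linear interpolation z_t = (1 - t) * z0 + t * eps, as used
    by rectified-flow models. The real schedule depends on the backbone.
    """
    eps = rng.standard_normal(z0.shape)
    return (1.0 - t) * z0 + t * eps


def refine_latent(z0, denoise_step, t_start=0.6, num_steps=10, seed=0):
    """Partially noise the prior's latent, then integrate back to t = 0.

    denoise_step(z, t) is a hypothetical callable returning the predicted
    velocity dz/dt at noise level t. A smaller t_start keeps the result
    closer to the prior; a larger one gives the backbone more freedom.
    """
    rng = np.random.default_rng(seed)
    z = inject_noise(z0, t_start, rng)
    ts = np.linspace(t_start, 0.0, num_steps + 1)
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        velocity = denoise_step(z, t_cur)
        z = z + (t_next - t_cur) * velocity  # Euler step toward t = 0
    return z


# Toy usage: a fake "backbone" that always steers toward a known target.
target = np.ones((4, 8))
prior_latent = np.zeros((4, 8))
toy_denoiser = lambda z, t: (z - target) / t
refined = refine_latent(prior_latent, toy_denoiser)
```

Because only part of the noise range is traversed, the starting latent still shapes the trajectory, which is the sense in which the prior's geometric cues guide the denoising. Starting from the encoded latent of a source mesh instead of the prior would reuse the same loop for the editing use case.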

Comparison Sliders

Side-by-side comparisons of our results against each baseline method, each shown with its input image: DrawingSpinUP, Direct3D, Hunyuan3D-2.1, Hunyuan3D-Omni, and Trellis.

Interactive 3D Mesh Gallery

Generated 3D meshes, each shown with its input image: Astronaut, Horse, Neil Armstrong, Model 4, and Model 5.