DeepVoxels++: Enhancing the Fidelity of Novel View Synthesis from 3D Voxel Embeddings

Tong He, John Collomosse, Hailin Jin, Stefano Soatto; Proceedings of the Asian Conference on Computer Vision (ACCV), 2020


We present a novel view synthesis method based upon latent voxel embeddings of an object, which encode both shape and appearance information and are learned without explicit 3D occupancy supervision. Our method uses an encoder-decoder architecture to learn such deep volumetric representations from a set of images taken at multiple viewpoints. Compared with DeepVoxels, our DeepVoxels++ applies a series of enhancements: a) a patch-based image feature extraction and neural rendering scheme that learns local shape and texture patterns and enables neural rendering at high resolution; b) learned view-dependent feature transformation kernels that explicitly model the perspective transformations induced by viewpoint changes; c) a recurrent-concurrent aggregation technique that alleviates the single-view update bias of the recurrent learning process for the voxel embeddings. Combined with d) a simple yet effective implementation trick, sufficient sampling of the frustum representation, we achieve improved visual quality over prior deep voxel-based methods (33% SSIM error reduction and 22% PSNR improvement) on 360-degree novel-view synthesis benchmarks.
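To give a flavor of enhancement (b), the following is a minimal sketch of a view-dependent feature transformation: a bank of channel-mixing kernels is indexed by a discretized view direction, and the selected kernel reweights per-voxel features before rendering. All names, shapes, and the binning scheme are illustrative assumptions, not the paper's actual architecture (where the kernels are learned end-to-end).

```python
import numpy as np

rng = np.random.default_rng(0)

N_BINS, C = 8, 16  # hypothetical: view-direction bins, feature channels
# One C x C mixing kernel per view bin (learned in the real method,
# random here purely for illustration).
kernels = rng.standard_normal((N_BINS, C, C))

def view_bin(azimuth_rad: float) -> int:
    """Map a viewing azimuth to one of N_BINS discrete direction bins."""
    frac = (azimuth_rad % (2 * np.pi)) / (2 * np.pi)
    return int(frac * N_BINS) % N_BINS

def transform_features(voxel_feats: np.ndarray, azimuth_rad: float) -> np.ndarray:
    """Apply the view-conditioned channel mixing to (V, C) voxel features."""
    k = kernels[view_bin(azimuth_rad)]  # (C, C) kernel for this viewpoint
    return voxel_feats @ k.T            # (V, C) transformed features

feats = rng.standard_normal((64, C))    # embeddings for 64 voxels in the frustum
out = transform_features(feats, azimuth_rad=1.2)
print(out.shape)  # same (V, C) shape, now conditioned on the view direction
```

Conditioning the transformation on view direction lets the representation express appearance that changes with viewpoint (e.g., foreshortening and specular effects) rather than forcing a single view-agnostic feature per voxel.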

Related Material

[pdf] [supp] [code]
@InProceedings{He_2020_ACCV,
  author    = {He, Tong and Collomosse, John and Jin, Hailin and Soatto, Stefano},
  title     = {DeepVoxels++: Enhancing the Fidelity of Novel View Synthesis from 3D Voxel Embeddings},
  booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
  month     = {November},
  year      = {2020}
}