- [pdf] [supp]
DDIT: Semantic Scene Completion via Deformable Deep Implicit Templates
Scene reconstructions are often incomplete due to occlusions and limited viewpoints. There have been efforts to use semantic information for scene completion. However, the completed shapes may be rough and imprecise since respective methods rely on 3D convolution and/or lack effective shape constraints. To overcome these limitations, we propose a semantic scene completion method based on deformable deep implicit templates (DDIT). Specifically, we complete each segmented instance in a scene by deforming a template with a latent code. Such a template is expressed by a deep implicit function in the canonical frame. It abstracts the shape prior of a category, and thus can provide constraints on the overall shape of an instance. Latent code controls the deformation of template to guarantee fine details of an instance. For code prediction, we design a neural network that leverages both intra- and inter-instance information. We also introduce an algorithm to transform instances between the world and canonical frames based on geometric constraints and a hierarchical tree. To further improve accuracy, we jointly optimize the latent code and transformation by enforcing the zero-valued isosurface constraint. In addition, we establish a new dataset to solve different problems of existing datasets. Experiments showed that our DDIT outperforms state-of-the-art approaches.