Image-to-Voxel Model Translation with Conditional Adversarial Networks

Vladimir A. Knyaz, Vladimir V. Kniaz, Fabio Remondino; Proceedings of the European Conference on Computer Vision (ECCV) Workshops, 2018

Abstract


We present a single-view voxel model prediction method based on generative adversarial networks. Our method uses correspondences between 2D silhouettes and slices of a camera frustum to predict a voxel model of a scene containing multiple object instances. We exploit a pyramid-shaped voxel grid and a generator network with skip connections between 2D and 3D feature maps. To train our framework, we collected two datasets, VoxelCity and VoxelHome, comprising 36,416 images of 28 scenes with ground-truth 3D models, depth maps, and 6D object poses. We have made the datasets publicly available. We evaluate our framework on 3D shape datasets and show that it delivers robust 3D scene reconstruction results that compete with and surpass the state of the art in the reconstruction of scenes with multiple non-rigid objects.
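The core correspondence the abstract describes pairs a 2D silhouette with the depth slices of a camera frustum. A minimal numpy sketch of that lifting step is below; the function name, shapes, and the uniform replication along depth are illustrative assumptions, not the paper's actual architecture (which uses a learned generator to carve occupancy per slice).

```python
import numpy as np

def silhouette_to_frustum(silhouette: np.ndarray, depth_slices: int) -> np.ndarray:
    """Lift a 2D binary silhouette (H x W) into a coarse frustum voxel grid (D x H x W).

    Each depth slice starts with the silhouette's occupancy; in the paper's
    setting a generator network with 2D-to-3D skip connections would refine
    each slice. This uniform replication is an illustrative sketch only.
    """
    # Broadcast the H x W mask along a new leading depth axis.
    return np.broadcast_to(silhouette, (depth_slices,) + silhouette.shape).copy()

# Example: a 4x4 silhouette with a 2x2 occupied square, lifted to 3 depth slices.
sil = np.zeros((4, 4), dtype=np.uint8)
sil[1:3, 1:3] = 1
frustum = silhouette_to_frustum(sil, depth_slices=3)
```

Every frustum slice then shares the silhouette's footprint, which is the 2D-3D correspondence the skip connections in the generator are meant to exploit.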

Related Material


@InProceedings{Knyaz_2018_ECCV_Workshops,
author = {Knyaz, Vladimir A. and Kniaz, Vladimir V. and Remondino, Fabio},
title = {Image-to-Voxel Model Translation with Conditional Adversarial Networks},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV) Workshops},
month = {September},
year = {2018}
}