Vox-E: Text-guided Voxel Editing of 3D Objects

Supplementary Material

 


 


Ablations

We ablate the key components of our framework by comparing against an image-space L2 loss (col 2) and by showing our result before (col 3) and after (col 4) refinement.
Due to space constraints, we do not show an image-space L1 loss (which performs similarly to the image-space L2 loss) or volumetric L1 and L2 losses (which also underperform our correlation-based volumetric loss).
However, the accompanying PDF provides a quantitative comparison that also covers these additional image-space and volumetric losses.
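
To make the difference between the loss families concrete, here is a minimal sketch rather than the implementation used in our framework. It assumes the correlation-based loss takes the form of one minus the Pearson correlation between the input and edited density values; the exact formulation is given in the main paper. The function names and the toy density values are hypothetical. The same L1 and L2 functions apply to rendered pixels (image space) or to voxel values (volumetric).

```python
import math


def l1_loss(a, b):
    """Mean absolute difference between two equal-length flattened grids."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def l2_loss(a, b):
    """Mean squared difference between two equal-length flattened grids."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def correlation_loss(a, b, eps=1e-12):
    """One minus the Pearson correlation of two flattened grids.

    Assumed form for illustration; eps guards against zero variance.
    """
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)
    var_a = sum((x - mean_a) ** 2 for x in a) / len(a)
    var_b = sum((y - mean_b) ** 2 for y in b) / len(b)
    return 1.0 - cov / (math.sqrt(var_a * var_b) + eps)


# Toy flattened density grid (hypothetical values, not data from the paper).
input_density = [0.0, 0.1, 0.8, 0.9, 0.2, 0.0, 0.7, 0.4]
# An edit that rescales and shifts densities but keeps their spatial pattern.
edited_density = [2.0 * d + 0.5 for d in input_density]

print("L1:", l1_loss(input_density, edited_density))
print("L2:", l2_loss(input_density, edited_density))
print("correlation:", correlation_loss(input_density, edited_density))
```

In this toy case the L1 and L2 losses penalize the edit even though the spatial structure of the density is unchanged, while the correlation loss stays near zero. This is one plausible intuition for why a correlation-based term leaves more room for an edit than a direct L1 or L2 penalty.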


Input | Image-space L2 | Ours (unrefined) | Ours (refined)

A kangaroo wearing a christmas sweater



A dog wearing big sunglasses



A cat wearing a birthday party hat