Physics-guided Shape-from-Template: Monocular Video Perception through Neural Surrogate Models

David Stotko, Nils Wandel, Reinhard Klein; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 11895-11904

Abstract


3D reconstruction of dynamic scenes is a long-standing problem in computer graphics and becomes increasingly difficult the less information is available. Shape-from-Template (SfT) methods aim to reconstruct a template-based geometry from RGB images or video sequences, often leveraging just a single monocular camera without depth information, such as regular smartphone recordings. Unfortunately, existing reconstruction methods are either unphysical and noisy or slow in optimization. To solve this problem, we propose a novel SfT reconstruction algorithm for cloth using a pre-trained neural surrogate model that is fast to evaluate, stable, and produces smooth reconstructions due to a regularizing physics simulation. Differentiable rendering of the simulated mesh enables pixel-wise comparisons between the reconstruction and a target video sequence that can be used for a gradient-based optimization procedure to extract not only shape information but also physical parameters such as stretching, shearing, or bending stiffness of the cloth. This allows us to retain a precise, stable, and smooth reconstructed geometry while reducing the runtime by a factor of 400-500 compared to ϕ-SfT, a state-of-the-art physics-based SfT approach.

Related Material


@InProceedings{Stotko_2024_CVPR,
    author    = {Stotko, David and Wandel, Nils and Klein, Reinhard},
    title     = {Physics-guided Shape-from-Template: Monocular Video Perception through Neural Surrogate Models},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {11895-11904}
}