LIM: Large Interpolator Model for Dynamic Reconstruction

Remy Sabathier, Niloy J. Mitra, David Novotny; Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), 2025, pp. 6154-6164

Abstract


Reconstructing dynamic assets from video data is central to many tasks in computer vision and graphics. Existing 4D reconstruction approaches are limited to category-specific models or rely on slow optimization-based methods. Inspired by the recent Large Reconstruction Model (LRM), we present the Large Interpolator Model (LIM), a transformer-based feed-forward solution, guided by a novel causal consistency loss, for interpolating implicit 3D representations across time. Given implicit 3D representations at times t_0 and t_1, LIM produces a deformed shape at any continuous time t ∈ [t_0, t_1], delivering high-quality interpolations in seconds per frame. Furthermore, LIM allows explicit mesh tracking across time, producing a consistently uv-textured mesh sequence ready for integration into existing production pipelines. We also use LIM, in conjunction with a diffusion-based multiview generator, to produce dynamic 4D reconstructions from monocular videos. We evaluate LIM on various dynamic datasets, benchmarking against image-space interpolation methods (e.g., FiLM) and direct triplane linear interpolation, and demonstrate clear advantages. In summary, LIM is the first feed-forward model capable of high-speed tracked 4D asset reconstruction across diverse categories.
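
For intuition, below is a minimal sketch (not the authors' code) of the "direct triplane linear interpolation" baseline the abstract compares against: each implicit 3D representation is a triplane feature grid, and the baseline simply blends the two grids at a continuous time t. The triplane layout (three C-channel H x W planes) and all names here are illustrative assumptions.

import torch

def lerp_triplanes(tri0: torch.Tensor, tri1: torch.Tensor, t: float) -> torch.Tensor:
    """Linearly blend two triplane feature grids at a continuous time t in [0, 1].

    tri0, tri1: tensors of shape (3, C, H, W) holding the XY/XZ/YZ feature
    planes of the implicit 3D representation at times t_0 and t_1.
    """
    assert tri0.shape == tri1.shape, "triplanes must share one layout"
    t = float(min(max(t, 0.0), 1.0))  # clamp to the valid interpolation range
    return (1.0 - t) * tri0 + t * tri1

# Example: blend two (random, stand-in) triplanes at the midpoint t = 0.5.
tri_t0 = torch.randn(3, 32, 128, 128)
tri_t1 = torch.randn(3, 32, 128, 128)
tri_mid = lerp_triplanes(tri_t0, tri_t1, 0.5)

Such per-feature blending ignores the scene's motion, which is why the paper instead learns a feed-forward interpolator over the same representation.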

Related Material


[pdf] [supp] [arXiv]
[bibtex]
@InProceedings{Sabathier_2025_CVPR,
    author    = {Sabathier, Remy and Mitra, Niloy J. and Novotny, David},
    title     = {LIM: Large Interpolator Model for Dynamic Reconstruction},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {6154-6164}
}