SeeU: Seeing the Unseen World via 4D Dynamics-aware Generation

Yu Yuan, Tharindu Wickremasinghe, Zeeshan Nadir, Xijun Wang, Yiheng Chi, Stanley H. Chan; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026, pp. 11150-11162

Abstract


Images and videos are discrete 2D projections of the 4D world (3D space + time). Most visual understanding, prediction, and generation operate directly on 2D observations, leading to suboptimal performance. We propose SeeU, a novel approach that learns the continuous 4D dynamics and generate the unseen visual contents. The principle behind SeeU is a new 2D\to4D\to2D learning framework. SeeU first reconstructs the 4D world from sparse and monocular 2D frames (2D\to4D). It then learns the continuous 4D dynamics on a low-rank representation and physical constraints (discrete 4D\tocontinuous 4D). Finally, SeeU rolls the world forward in time, re-projects it back to 2D at sampled times and viewpoints, and generates unseen regions based on spatial-temporal context awareness (4D\to2D). By modeling dynamics in 4D, SeeU achieves continuous and physically-consistent novel visual generation, demonstrating strong potentials in multiple tasks including unseen temporal generation, unseen spatial generation, and video editing. All data and code will be public at \href https://yuyuanspace.com/SeeU/ here .

Related Material


[pdf] [supp] [arXiv]
[bibtex]
@InProceedings{Yuan_2026_CVPR, author = {Yuan, Yu and Wickremasinghe, Tharindu and Nadir, Zeeshan and Wang, Xijun and Chi, Yiheng and Chan, Stanley H.}, title = {SeeU: Seeing the Unseen World via 4D Dynamics-aware Generation}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2026}, pages = {11150-11162} }