@InProceedings{Lu_2026_CVPR,
    author    = {Lu, Xuanchen and Cao, Ang and Feng, Chao and Owens, Andrew},
    title     = {Generative Point Tracking and Forecasting},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
    pages     = {28167-28178}
}
Generative Point Tracking and Forecasting
Abstract
Motion forecasting predicts where points will move in the future, while motion tracking predicts where they are in the present. Despite these similarities, existing approaches to the two problems are quite different. In this paper, we propose a unified model that can address both tasks. We train a causal, video-conditioned flow matching model to predict point positions. The resulting model can easily toggle between point tracking and forecasting by changing its visual signal. Despite our model's simplicity, we find that it outperforms prior work in point forecasting and obtains performance that is competitive with the state of the art on the TAP-Vid benchmark.