@InProceedings{Materzynska_2024_ACCV,
  author    = {Materzy\'nska, Joanna and Sivic, Josef and Shechtman, Eli and Torralba, Antonio and Zhang, Richard and Russell, Bryan},
  title     = {NewMove: Customizing text-to-video models with novel motions},
  booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
  month     = {December},
  year      = {2024},
  pages     = {1634-1651}
}
NewMove: Customizing text-to-video models with novel motions
Abstract
We introduce an approach for augmenting text-to-video generation models with novel motions, extending their capabilities beyond the motions contained in the original training data. Taking a few video samples demonstrating a specific movement as input, our method learns and generalizes the input motion pattern to diverse, text-specified scenarios. Our method finetunes an existing text-to-video model to learn a novel mapping from the motion depicted in the input examples to a new unique token. To avoid overfitting to the new custom motion, we introduce a regularization approach over videos. Leveraging the motion priors in a pretrained model, our method learns a generalized motion pattern that can be invoked in novel videos featuring multiple people performing the custom motion, or in combination with other motions. To validate our method, we quantitatively evaluate the learned custom motion and perform a systematic ablation study. We show that our method significantly outperforms prior appearance-based customization approaches when extended to the motion customization task.
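The core idea of mapping a demonstrated motion to a new unique token can be sketched in miniature. The toy below is an assumption-laden illustration, not the authors' implementation: the pretrained text-to-video model is replaced by a fixed linear map `W`, the motion extracted from the example videos by a target feature vector, and finetuning by gradient descent on the new token's embedding alone, with everything else frozen.

```python
import numpy as np

# Hypothetical, simplified sketch of token-based motion customization.
# Only the new token's embedding is a trainable parameter; the
# "pretrained model" (here a fixed linear map W) stays frozen.

rng = np.random.default_rng(0)
dim = 8

# Frozen stand-in "model": maps a token embedding to a motion-feature space.
W = rng.normal(size=(dim, dim)) / np.sqrt(dim)

# Stand-in motion feature distilled from the few example videos.
target = rng.normal(size=dim)

# New unique token embedding, randomly initialized; the only parameter
# we optimize.
v_new = rng.normal(size=dim)

init_loss = float(np.sum((W @ v_new - target) ** 2))

lr = 0.1
for _ in range(500):
    pred = W @ v_new
    grad = 2.0 * W.T @ (pred - target)  # gradient of ||W v - target||^2
    v_new -= lr * grad

final_loss = float(np.sum((W @ v_new - target) ** 2))
print(f"loss: {init_loss:.4f} -> {final_loss:.4f}")
```

In the actual method the objective is a denoising loss of the video diffusion model, and regularization over videos keeps the token from memorizing the appearance of the examples; this sketch only conveys the freeze-everything-but-the-token structure.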