UnityVideo: Unified Multi-Modal Multi-Task Learning
for Enhancing World-Aware Video Generation
Anonymous author
Supplementary Material
Figure: Overview of the UnityVideo Framework
📹 Teaser Videos
JointGen
Estimator
ControGen
✨ Method Showcases
JointGen - Text to Video
Estimator - Video to Modality
ControGen - Modality to Video