We compare reconstruction results from directly extending MagViTv2 to 8× temporal compression against our method, ProMAG, at 8× temporal compression with our progressive growing approach. We also compare with Cosmos-CV-8×. With a 16-channel latent space (zdim=16), ProMAG at 8× temporal compression produces much sharper reconstructions and does not contain the blurriness observed in the reconstructions of Cosmos-CV at 8× temporal compression. Similarly, with an 8-channel latent space (zdim=8), ProMAG at 8× temporal compression produces much more accurate reconstructions and does not contain the artifacts observed when directly extending MagViTv2 to 8× temporal compression.
Panels (zdim=16): Ground-Truth, MagViTv2-8×, Cosmos-CV-8×, ProMAG-8×.
Panels (zdim=8): Ground-Truth, MagViTv2-8×, ProMAG-8×.