Qualitative comparison of 4D generation methods. We compare the 4D generation results of our method against DreamGaussian4D (DG4D) [38], SV4D [64], and 4Real-Video [53]. All methods take a monocular video as input. DG4D additionally incorporates our accurate static Gaussian representation, whereas SV4D and 4Real-Video rely on freeze-time renderings. Our method takes the output of 4Real-Video as input and further enhances its novel view synthesis. Notably, ours is the only method that produces high-quality 3D geometry. Viewpoints are shown at ±60 degrees relative to the frontal view, i.e., the original perspective used to generate the monocular video. We also show a rendering of the static mesh from the same viewpoint, labeled 'static reference'. Among all methods, ours achieves the most visually consistent and appealing results.