GoMAvatar: Efficient Animatable Human Modeling from Monocular Video Using Gaussians-on-Mesh

Jing Wen, Xiaoming Zhao, Zhongzheng Ren, Alexander G. Schwing, Shenlong Wang; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 2059-2069

Abstract


We introduce GoMAvatar, a novel approach for real-time, memory-efficient, high-quality animatable human modeling. GoMAvatar takes as input a single monocular video to create a digital avatar capable of re-articulation in new poses and real-time rendering from novel viewpoints, while seamlessly integrating with rasterization-based graphics pipelines. Central to our method is the Gaussians-on-Mesh (GoM) representation, a hybrid 3D model combining the rendering quality and speed of Gaussian splatting with the geometry modeling and compatibility of deformable meshes. We assess GoMAvatar on ZJU-MoCap, PeopleSnapshot, and various YouTube videos. GoMAvatar matches or surpasses current monocular human modeling algorithms in rendering quality and significantly outperforms them in computational efficiency (43 FPS) while being memory-efficient (3.63 MB per subject).
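
To make the Gaussians-on-Mesh idea concrete, the following is a minimal illustrative sketch (not the authors' implementation): Gaussians are anchored to mesh faces, and when the mesh is re-posed via linear blend skinning the Gaussians follow their parent faces. All names here (face_frames, lbs_deform, the toy triangle and joint) are hypothetical placeholders used only for illustration.

import numpy as np

def face_frames(vertices, faces):
    """Per-face anchor point (centroid) and a crude tangent frame for each attached Gaussian."""
    v0, v1, v2 = (vertices[faces[:, i]] for i in range(3))
    centers = (v0 + v1 + v2) / 3.0
    e1 = v1 - v0
    normals = np.cross(e1, v2 - v0)
    normals /= np.linalg.norm(normals, axis=1, keepdims=True) + 1e-8
    t1 = e1 / (np.linalg.norm(e1, axis=1, keepdims=True) + 1e-8)
    t2 = np.cross(normals, t1)
    frames = np.stack([t1, t2, normals], axis=-1)  # (F, 3, 3) orientation per Gaussian
    return centers, frames

def lbs_deform(vertices, weights, joint_transforms):
    """Linear blend skinning: weights (V, J), joint_transforms (J, 4, 4)."""
    homo = np.concatenate([vertices, np.ones((len(vertices), 1))], axis=1)
    blended = np.einsum('vj,jab->vab', weights, joint_transforms)
    return np.einsum('vab,vb->va', blended, homo)[:, :3]

# Tiny example: a single-triangle "mesh" driven by one joint translation.
verts = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.]])
faces = np.array([[0, 1, 2]])
weights = np.ones((3, 1))
pose = np.eye(4)
pose[:3, 3] = [0.0, 0.0, 0.5]
posed_verts = lbs_deform(verts, weights, pose[None])
centers, frames = face_frames(posed_verts, faces)
print(centers)  # the Gaussian mean moves with the deformed mesh

The point of the sketch is only the coupling: Gaussian positions and orientations are derived from the deformed mesh, so animation reduces to standard mesh skinning while rendering can use Gaussian splatting.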

Related Material


@InProceedings{Wen_2024_CVPR,
    author    = {Wen, Jing and Zhao, Xiaoming and Ren, Zhongzheng and Schwing, Alexander G. and Wang, Shenlong},
    title     = {GoMAvatar: Efficient Animatable Human Modeling from Monocular Video Using Gaussians-on-Mesh},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {2059-2069}
}