InstanceGaussian: Appearance-Semantic Joint Gaussian Representation for 3D Instance-Level Perception

Li, Haijie; Wu, Yanmin; Meng, Jiarui; Gao, Qiankun; Zhang, Zhiyao; Wang, Ronggang; Zhang, Jian

Haijie Li, Yanmin Wu, Jiarui Meng, Qiankun Gao, Zhiyao Zhang, Ronggang Wang, Jian Zhang; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025, pp. 14078-14088

Abstract

3D scene understanding is vital for applications in autonomous driving, robotics, and augmented reality. However, scene understanding based on 3D Gaussian Splatting faces three key challenges: (i) an imbalance between appearance and semantics, (ii) inconsistencies in object boundaries, and (iii) difficulties with top-down instance segmentation. To address these challenges, we propose InstanceGaussian, a method that jointly learns appearance and semantic features while adaptively aggregating instances. Our contributions are as follows: (i) a new Semantic-Scaffold-GS representation to improve feature representation and boundary delineation, (ii) a progressive training strategy for enhanced stability and segmentation, and (iii) a category-agnostic, bottom-up instance aggregation approach for better segmentation. Experimental results demonstrate that our approach achieves state-of-the-art performance in category-agnostic, open-vocabulary 3D point-level segmentation, validating the effectiveness of our proposed method.

Related Material


@InProceedings{Li_2025_CVPR,
    author    = {Li, Haijie and Wu, Yanmin and Meng, Jiarui and Gao, Qiankun and Zhang, Zhiyao and Wang, Ronggang and Zhang, Jian},
    title     = {InstanceGaussian: Appearance-Semantic Joint Gaussian Representation for 3D Instance-Level Perception},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {14078-14088}
}