Rethinking End-to-End 2D to 3D Scene Segmentation in Gaussian Splatting

Runsong Zhu, Shi Qiu, Zhengzhe Liu, Ka-Hei Hui, Qianyi Wu, Pheng-Ann Heng, Chi-Wing Fu; Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), 2025, pp. 3656-3665

Abstract


Lifting multi-view 2D instance segmentation to a radiance field has proven effective for enhancing 3D understanding. Existing works either rely on direct matching for end-to-end lifting, which yields inferior results, or employ a two-stage solution constrained by complex pre- or post-processing. In this work, we design Unified-Lift, a new end-to-end object-aware lifting approach that aims for high-quality 3D segmentation based on our object-aware 3D Gaussian representation. To start, we augment each Gaussian point with a Gaussian-level feature, learned using a contrastive loss, to encode instance information. Importantly, we introduce a learnable object-level codebook that accounts for the individual objects in the scene, providing an explicit object-level understanding, and we associate the encoded object-level features with the Gaussian-level point features to produce segmentation predictions. While promising, effective codebook learning is nontrivial, and a naive solution leads to degraded performance. Hence, we formulate an association learning module and a noisy label filtering module for effective and robust codebook learning. We conduct experiments on three benchmarks: LERF-Masked, Replica, and Messy Rooms. Both qualitative and quantitative results show that Unified-Lift clearly outperforms existing methods in terms of segmentation quality and time efficiency.
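
The abstract does not state how Gaussian-level features are associated with the object-level codebook. As a rough illustration only (not the authors' implementation), the sketch below assumes each codebook entry is a feature vector, scores every point feature against every entry with a dot product, and turns the scores into per-object probabilities with a softmax. The function name `assign_to_codebook`, the `temperature` parameter, and the toy feature values are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def assign_to_codebook(point_features, codebook, temperature=0.1):
    """Assumed association rule: softmax over dot-product similarities.

    Returns one (probabilities, predicted_object_index) pair per point.
    """
    results = []
    for feat in point_features:
        logits = [dot(feat, code) / temperature for code in codebook]
        probs = softmax(logits)
        label = max(range(len(probs)), key=probs.__getitem__)
        results.append((probs, label))
    return results


# Toy data (hypothetical): two object codes and three Gaussian-level features.
codebook = [[1.0, 0.0], [0.0, 1.0]]
point_features = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]]

assignments = assign_to_codebook(point_features, codebook)
labels = [label for _, label in assignments]
print(labels)
```

In this toy setup, each point is assigned to the codebook entry it is most similar to. A lower `temperature` sharpens the probabilities toward a hard assignment, while a higher one keeps them soft.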

Related Material


[bibtex]
@InProceedings{Zhu_2025_CVPR,
    author    = {Zhu, Runsong and Qiu, Shi and Liu, Zhengzhe and Hui, Ka-Hei and Wu, Qianyi and Heng, Pheng-Ann and Fu, Chi-Wing},
    title     = {Rethinking End-to-End 2D to 3D Scene Segmentation in Gaussian Splatting},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {3656-3665}
}