Gau-Occ: Geometry-Completed Gaussians for Multi-Modal 3D Occupancy Prediction

Lv, Chengxin; Li, Yihui; Yang, Hongyu; Wang, YunHong

Chengxin Lv, Yihui Li, Hongyu Yang, YunHong Wang; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026, pp. 14198-14207

Abstract

3D semantic occupancy prediction is crucial for autonomous driving, yet vision-only approaches suffer from weak geometric cues, and existing multi-modal frameworks often depend on dense voxel or BEV tensors that impose heavy computational cost. We present Gau-Occ, a multi-modal framework that models the scene as a compact collection of semantic 3D Gaussians, enabling geometry-guided fusion without dense volumetric processing. To enhance geometric completeness, a learned LiDAR Completion Diffuser (LCD) trained on real-world priors recovers missing structures from sparse LiDAR, and the completed points are encoded as semantic Gaussian anchors. To further integrate multi-view image semantics, we introduce Gaussian Anchor Fusion (GAF), a geometry-aligned aggregation module that performs anchor-guided 2D sampling, local neighborhood encoding, and cross-modal alignment. By constructing locally aggregated Gaussian descriptors that capture spatial consistency and semantic discriminability, GAF facilitates accurate feature association across modalities. Through anchor-driven refinement of Gaussian attributes, Gau-Occ supports detailed 3D occupancy prediction. Extensive experiments across challenging benchmarks demonstrate that Gau-Occ achieves state-of-the-art performance.
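As an illustration of what "anchor-guided 2D sampling" could look like in practice, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a pinhole camera with intrinsics `K`, a world-to-camera transform `T_cam_from_world`, and a single-view feature map of shape (H, W, C). All function names (`project_points`, `bilinear_sample`, `sample_anchor_features`) are hypothetical and chosen only for this example.

```python
import numpy as np


def project_points(points_xyz, K, T_cam_from_world):
    """Project (N, 3) world points to pixel coordinates with a pinhole model.

    Returns (uv, in_front): uv is (N, 2) as (u=x, v=y); in_front marks
    points with positive depth in the camera frame.
    """
    n = points_xyz.shape[0]
    homo = np.concatenate([points_xyz, np.ones((n, 1))], axis=1)  # (N, 4)
    cam = (T_cam_from_world @ homo.T).T[:, :3]                    # (N, 3)
    z = cam[:, 2]
    in_front = z > 1e-6
    safe_z = np.where(in_front, z, 1.0)  # avoid dividing by zero or negative depth
    pix = (K @ cam.T).T
    uv = pix[:, :2] / safe_z[:, None]
    return uv, in_front


def bilinear_sample(feat, uv):
    """Bilinearly sample an (H, W, C) feature map at (N, 2) pixel coordinates.

    Returns (values, inside): values is (N, C); inside marks coordinates
    that fall within the image bounds before clamping.
    """
    h, w, _ = feat.shape
    u, v = uv[:, 0], uv[:, 1]
    inside = (u >= 0) & (u <= w - 1) & (v >= 0) & (v <= h - 1)
    u = np.clip(u, 0, w - 1)
    v = np.clip(v, 0, h - 1)
    u0 = np.floor(u).astype(int)
    v0 = np.floor(v).astype(int)
    u1 = np.minimum(u0 + 1, w - 1)
    v1 = np.minimum(v0 + 1, h - 1)
    du = (u - u0)[:, None]
    dv = (v - v0)[:, None]
    top = feat[v0, u0] * (1 - du) + feat[v0, u1] * du
    bottom = feat[v1, u0] * (1 - du) + feat[v1, u1] * du
    return top * (1 - dv) + bottom * dv, inside


def sample_anchor_features(anchors_xyz, feat, K, T_cam_from_world):
    """Gather one image feature per 3D anchor; invisible anchors get zeros."""
    uv, in_front = project_points(anchors_xyz, K, T_cam_from_world)
    sampled, inside = bilinear_sample(feat, uv)
    visible = in_front & inside
    sampled[~visible] = 0.0
    return sampled, visible
```

In a multi-view setting, one plausible extension is to run this per camera and average the sampled features over the views in which each anchor is visible. Whether Gau-Occ aggregates views this way is not stated in the abstract.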

Related Material

[bibtex]

@InProceedings{Lv_2026_CVPR,
    author    = {Lv, Chengxin and Li, Yihui and Yang, Hongyu and Wang, YunHong},
    title     = {Gau-Occ: Geometry-Completed Gaussians for Multi-Modal 3D Occupancy Prediction},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
    pages     = {14198-14207}
}