SurfOcc: Surface-based Feature Lifting for Vision-centric 3D Occupancy Prediction

Tonghui Ye, Zhi Gao, Zhipeng Lin, Xinyi Liu, Ronghe Jin; Proceedings of the Asian Conference on Computer Vision (ACCV), 2024, pp. 2421-2438

Abstract


3D occupancy prediction has emerged as a trend in 3D perception because it preserves fine geometric and semantic details. However, existing vision-based approaches either leave features unrefined or neglect depth ambiguity because of flawed 2D-to-3D feature lifting modules, which leads to imprecise predictions. In this paper, we introduce SurfOcc, a vision-centric 3D occupancy prediction framework that addresses both limitations. SurfOcc decouples the learning of observed surfaces from that of occluded regions while integrating the two into an end-to-end architecture. Specifically, we first propose surface-based feature lifting, which precisely locates observed surfaces and enhances the selected surface voxels via cross-attention during feature lifting. We then design a feature diffuser that incorporates both local and global features to diffuse the reliable surface features to occluded regions. Experiments show that SurfOcc achieves state-of-the-art performance with 13.75 mIoU on SemanticKITTI and 42.38 mIoU on Occ3D-nuScenes, which also demonstrates its potential for handling occlusion. Code is available at https://github.com/sullicsullic/SurfOcc.
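To make the idea of "locating observed surfaces" concrete, the sketch below back-projects a per-pixel depth map into a voxel grid and marks the voxels that an observed surface falls into. It is an illustration only, not the SurfOcc implementation: the abstract does not say how surfaces are located, so the use of a depth map, the pinhole intrinsics `K`, the function name `surface_voxels_from_depth`, and the grid parameters are all assumptions made for this example. The cross-attention refinement and the feature diffuser are not shown.

```python
import numpy as np

def surface_voxels_from_depth(depth, K, voxel_size, grid_min, grid_shape):
    """Hypothetical helper: mark voxels hit by back-projected depth pixels.

    depth      : (H, W) array of depths along the camera z-axis (<= 0 means invalid)
    K          : (3, 3) pinhole intrinsics (assumed camera model)
    voxel_size : edge length of a cubic voxel
    grid_min   : (3,) coordinates of the grid's minimum corner, camera frame
    grid_shape : (X, Y, Z) number of voxels along each axis
    """
    h, w = depth.shape
    # Homogeneous pixel coordinates (u = column, v = row).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(np.float64)
    # Viewing rays with unit z, scaled by depth to obtain 3D points.
    rays = pix @ np.linalg.inv(K).T
    flat_depth = depth.reshape(-1)
    pts = rays * flat_depth[:, None]
    # Drop pixels without a valid depth.
    pts = pts[flat_depth > 0]
    # Convert points to integer voxel indices and keep those inside the grid.
    idx = np.floor((pts - np.asarray(grid_min)) / voxel_size).astype(int)
    inside = np.all((idx >= 0) & (idx < np.asarray(grid_shape)), axis=1)
    idx = idx[inside]
    mask = np.zeros(grid_shape, dtype=bool)
    mask[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return mask

# Toy usage: a 3x3 depth map with every pixel at depth 2.
K = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
depth = np.full((3, 3), 2.0)
mask = surface_voxels_from_depth(depth, K, 1.0, (-3.5, -3.5, -0.5), (6, 6, 4))
```

In a pipeline of the kind the abstract describes, a mask like this would select which voxels receive lifted image features before any propagation to occluded regions; how SurfOcc actually performs that selection is described in the paper and the linked repository.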

Related Material


@InProceedings{Ye_2024_ACCV,
    author    = {Ye, Tonghui and Gao, Zhi and Lin, Zhipeng and Liu, Xinyi and Jin, Ronghe},
    title     = {SurfOcc: Surface-based Feature Lifting for Vision-centric 3D Occupancy Prediction},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {December},
    year      = {2024},
    pages     = {2421-2438}
}