ALOcc: Adaptive Lifting-Based 3D Semantic Occupancy and Cost Volume-Based Flow Predictions

Dubing Chen, Jin Fang, Wencheng Han, Xinjing Cheng, Junbo Yin, Chengzhong Xu, Fahad Shahbaz Khan, Jianbing Shen; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2025, pp. 4156-4166

Abstract


3D semantic occupancy and flow prediction are fundamental to spatiotemporal scene understanding. This paper proposes a vision-based framework with three targeted improvements. First, we introduce an occlusion-aware adaptive lifting mechanism that incorporates depth denoising. This makes the 2D-to-3D feature transformation more robust while reducing its reliance on depth priors. Second, we enforce 3D-2D semantic consistency via jointly optimized prototypes, using confidence- and category-aware sampling to address the long-tailed class distribution. Third, to streamline joint prediction, we devise a BEV-centric cost volume that explicitly correlates semantic and flow features, supervised by a hybrid classification-regression scheme that handles diverse motion scales. Our purely convolutional architecture establishes new SOTA performance on multiple benchmarks for both semantic occupancy and joint occupancy semantic-flow prediction. We also present a family of models offering a spectrum of efficiency-performance trade-offs. Our real-time version exceeds all existing real-time methods in both speed and accuracy, supporting its practical viability.
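To make the third point more concrete, the sketch below shows one generic way a BEV cost volume can be built and decoded into flow. It is not the authors' implementation: the function names `bev_cost_volume` and `decode_flow`, the cosine-similarity matching, the integer offset window `radius`, and the softmax `temperature` are all assumptions chosen for illustration. The decoding step classifies over discrete displacement bins and then takes the probability-weighted mean of the bin centers, which is one common way to combine classification with a continuous regression output.

```python
import numpy as np

def bev_cost_volume(curr, prev, radius=1):
    """Illustrative BEV cost volume (assumed design, not from the paper).

    curr, prev: BEV feature maps of shape (C, H, W).
    Returns cost of shape (K, H, W), where cost[k, y, x] is the cosine
    similarity between curr[:, y, x] and prev[:, y + dy, x + dx] for the
    k-th offset (dy, dx), and the (K, 2) array of offsets.
    """
    eps = 1e-8
    # L2-normalize along channels so the dot product is a cosine similarity.
    curr_n = curr / (np.linalg.norm(curr, axis=0, keepdims=True) + eps)
    prev_n = prev / (np.linalg.norm(prev, axis=0, keepdims=True) + eps)
    _, H, W = curr.shape
    # Zero-pad the previous frame so every offset stays in bounds.
    padded = np.pad(prev_n, ((0, 0), (radius, radius), (radius, radius)))
    offsets = [(dy, dx)
               for dy in range(-radius, radius + 1)
               for dx in range(-radius, radius + 1)]
    cost = np.stack([
        (curr_n * padded[:, radius + dy:radius + dy + H,
                            radius + dx:radius + dx + W]).sum(axis=0)
        for dy, dx in offsets
    ])
    return cost, np.array(offsets, dtype=np.float32)

def decode_flow(cost, offsets, temperature=0.05):
    """Classification over offset bins, then a continuous expected offset.

    Returns flow of shape (2, H, W) holding (dy, dx) per BEV cell, i.e. the
    displacement from a current cell to its best match in the previous frame.
    """
    logits = cost / temperature
    logits = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    prob = np.exp(logits)
    prob /= prob.sum(axis=0, keepdims=True)
    # Probability-weighted mean of the bin centers.
    return np.einsum("khw,kd->dhw", prob, offsets)
```

In a learned model the similarity scores would typically be refined by further layers and the bin probabilities supervised directly; the sketch only shows the data flow from two BEV feature maps to a per-cell displacement.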

Related Material


@InProceedings{Chen_2025_ICCV,
    author    = {Chen, Dubing and Fang, Jin and Han, Wencheng and Cheng, Xinjing and Yin, Junbo and Xu, Chengzhong and Khan, Fahad Shahbaz and Shen, Jianbing},
    title     = {ALOcc: Adaptive Lifting-Based 3D Semantic Occupancy and Cost Volume-Based Flow Predictions},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2025},
    pages     = {4156-4166}
}