[pdf] [supp] [arXiv]
[bibtex]
@InProceedings{Xue_2025_WACV,
  author    = {Xue, Yujing and Liu, Jiaxiang and Du, Jiawei and Zhou, Joey Tianyi},
  title     = {PVP: Polar Representation Boost for 3D Semantic Occupancy Prediction},
  booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
  month     = {February},
  year      = {2025},
  pages     = {2746-2755}
}
PVP: Polar Representation Boost for 3D Semantic Occupancy Prediction
Abstract
Recently, representations based on polar coordinates have exhibited promising characteristics for 3D perceptual tasks. Compared with Cartesian-based methods, representing surrounding spaces through polar grids offers a compelling alternative: it can cover larger areas while preserving greater detail of nearby spaces. However, polar-based methods are inherently challenged by feature distortion arising from the non-uniform spatial division of the polar representation. To harness the advantages of polar representation while addressing its challenges, we propose the Polar Voxel Occupancy Predictor (PVP), a novel 3D multi-modal occupancy predictor operating in polar coordinates. PVP mitigates feature distortion and misalignment across modalities with two design elements: 1) a Global Represent Propagation (GRP) module, which incorporates global spatial information into the intermediate 3D volume while accounting for the prior spatial structure, and then employs Global Decomposed Attention to propagate features to their correct locations; and 2) Plane Decomposed Convolution (PD-Conv), which simplifies 3D distortions in polar coordinates by replacing 3D convolution with a series of 2D convolutions. With these straightforward yet impactful modifications, PVP surpasses state-of-the-art methods by significant margins on the OpenOccupancy dataset, improving by 1.9% mIoU and 2.9% IoU over LiDAR-only methods and by 7.9% mIoU and 6.8% IoU over multimodal methods.
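The abstract describes PD-Conv only at a high level: a 3D convolution over the polar voxel grid is replaced by a series of 2D convolutions. One common way to realize "2D convolutions over a 3D volume" is a 3D convolution whose kernel is flat along one axis, applied once per coordinate plane. The sketch below illustrates that idea; the class name, layer ordering, and choice of planes are assumptions for illustration, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class PDConvSketch(nn.Module):
    """Hypothetical sketch of a plane-decomposed convolution over a
    polar voxel grid laid out as (batch, channels, r, theta, z).
    Each layer is effectively a 2D conv on one coordinate plane,
    implemented as a 3D conv with a kernel that is flat on one axis."""

    def __init__(self, channels: int, k: int = 3):
        super().__init__()
        p = k // 2
        # 2D conv over the (theta, z) plane, shared across radius bins
        self.conv_tz = nn.Conv3d(channels, channels, (1, k, k), padding=(0, p, p))
        # 2D conv over the (r, z) plane, shared across angle bins
        self.conv_rz = nn.Conv3d(channels, channels, (k, 1, k), padding=(p, 0, p))
        # 2D conv over the (r, theta) plane, shared across height bins
        self.conv_rt = nn.Conv3d(channels, channels, (k, k, 1), padding=(p, p, 0))
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.act(self.conv_tz(x))
        x = self.act(self.conv_rz(x))
        return self.conv_rt(x)

# Toy polar volume: batch 1, 8 channels, 16 radius x 32 angle x 8 height bins
vol = torch.randn(1, 8, 16, 32, 8)
out = PDConvSketch(8)(vol)
assert out.shape == vol.shape  # spatial shape preserved by padding
```

Decomposing the kernel this way keeps the receptive field of a full 3D convolution (after stacking the three plane convs) while letting each plane be treated separately, which is one plausible reading of how PD-Conv sidesteps the anisotropic distortion of polar cells.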