Multi-Scale Voxel Class Balanced ASPP for LIDAR Pointcloud Semantic Segmentation

K. S. Chidanand, Samir Al-stouhi; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops, 2021, pp. 117-124

Abstract


This paper explores efficient techniques to improve the performance of the PolarNet model for real-time semantic segmentation of LiDAR point clouds. The core framework consists of an encoder network, an Atrous Spatial Pyramid Pooling (ASPP) / Dense Atrous Spatial Pyramid Pooling (DenseASPP) block, and a decoder network. The encoder extracts multi-scale voxel information in a top-down manner, while the decoder fuses feature maps from multiple scales in a bottom-up manner. Between the encoder and decoder, the ASPP/DenseASPP block enlarges the receptive field in a very dense manner. In contrast to the PolarNet model, we use weighted cross-entropy in conjunction with the Lovász-softmax loss to improve segmentation accuracy. This paper also accelerates training of the PolarNet model by incorporating learning-rate schedulers in conjunction with the Adam optimizer, achieving faster convergence in fewer epochs without degrading accuracy. Extensive experiments on the challenging SemanticKITTI dataset show that our high-resolution-grid model obtains a competitive state-of-the-art result of 60.6 mIoU at 21 fps, while our low-resolution-grid model obtains 54.01 mIoU at 35 fps, thereby balancing the accuracy/speed trade-off.
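For reference, a minimal PyTorch sketch of the kind of ASPP block described above is given below. The dilation rates, channel widths, and the choice of the plain ASPP variant (rather than DenseASPP) are illustrative assumptions, not the exact configuration reported in the paper.

import torch
import torch.nn as nn

class ASPP(nn.Module):
    # Parallel dilated convolutions over the same feature map fuse multi-scale
    # context; a 1x1 projection merges the concatenated branches.
    def __init__(self, in_channels, out_channels, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=3,
                          padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(out_channels),
                nn.ReLU(inplace=True),
            )
            for r in rates
        ])
        self.project = nn.Sequential(
            nn.Conv2d(len(rates) * out_channels, out_channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        # Each branch sees the input at a different effective receptive field.
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))

# Example with an illustrative bird's-eye-view feature map from the encoder:
# feats = torch.randn(1, 256, 60, 60)
# out = ASPP(256, 128)(feats)   # -> (1, 128, 60, 60)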

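Likewise, the combined objective (weighted cross-entropy plus Lovász-softmax) and the Adam/learning-rate-scheduler setup can be sketched as below. The class-weighting scheme, the ignore index, and the choice of cosine annealing as the scheduler are assumptions, since the abstract does not specify them.

import torch
import torch.nn as nn
import torch.nn.functional as F

def lovasz_grad(gt_sorted):
    # Gradient of the Lovasz extension of the Jaccard loss (Berman et al., 2018).
    p = len(gt_sorted)
    gts = gt_sorted.sum()
    intersection = gts - gt_sorted.cumsum(0)
    union = gts + (1.0 - gt_sorted).cumsum(0)
    jaccard = 1.0 - intersection / union
    if p > 1:
        jaccard[1:p] = jaccard[1:p] - jaccard[0:-1]
    return jaccard

def lovasz_softmax(probas, labels, ignore_index=255):
    # probas: (P, C) softmax probabilities; labels: (P,) integer class ids.
    valid = labels != ignore_index
    probas, labels = probas[valid], labels[valid]
    losses = []
    for c in range(probas.size(1)):
        fg = (labels == c).float()              # binary ground truth for class c
        if fg.sum() == 0:
            continue                            # average only over classes present
        errors = (fg - probas[:, c]).abs()
        errors_sorted, perm = torch.sort(errors, dim=0, descending=True)
        losses.append(torch.dot(errors_sorted, lovasz_grad(fg[perm])))
    return torch.stack(losses).mean()

class CombinedLoss(nn.Module):
    # Weighted cross-entropy plus Lovasz-softmax; the per-class weights (e.g.
    # inverse class frequency on SemanticKITTI) are an assumption.
    def __init__(self, class_weights, ignore_index=255):
        super().__init__()
        self.ce = nn.CrossEntropyLoss(weight=class_weights, ignore_index=ignore_index)
        self.ignore_index = ignore_index

    def forward(self, logits, labels):
        # logits: (N, C, H, W) raw scores; labels: (N, H, W) class ids.
        ce = self.ce(logits, labels)
        probas = F.softmax(logits, dim=1).permute(0, 2, 3, 1).reshape(-1, logits.size(1))
        return ce + lovasz_softmax(probas, labels.reshape(-1), self.ignore_index)

# Illustrative training setup: Adam paired with a learning-rate scheduler.
# The cosine-annealing choice and all hyperparameters below are assumptions.
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=40)
# ... each epoch: forward, CombinedLoss, backward, optimizer.step(); then scheduler.step()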
Related Material


[bibtex]
@InProceedings{Chidanand_2021_WACV,
    author    = {Chidanand, K. S. and Al-stouhi, Samir},
    title     = {Multi-Scale Voxel Class Balanced ASPP for LIDAR Pointcloud Semantic Segmentation},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops},
    month     = {January},
    year      = {2021},
    pages     = {117-124}
}