ProtoTransfer: Cross-Modal Prototype Transfer for Point Cloud Segmentation

Pin Tang, Hai-Ming Xu, Chao Ma; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 3337-3347

Abstract


Knowledge transfer from multi-modal, i.e., LiDAR points and images, to a single LiDAR modal can take advantage of complimentary information from modal-fusion but keep a single modal inference speed, showing a promising direction for point cloud semantic segmentation in autonomous driving. Recent advances in point cloud segmentation distill knowledge from strictly aligned point-pixel fusion features while leaving a large number of unmatched image pixels unexplored and unmatched LiDAR points under-benefited. In this paper, we propose a novel approach, named ProtoTransfer, which not only fully exploits image representations but also transfers the learned multi-modal knowledge to all point cloud features. Specifically, based on the basic multi-modal learning framework, we build up a class-wise prototype bank from the strictly-aligned fusion features and encourage all the point cloud features to learn from the prototypes during model training. Moreover, to exploit the massive unmatched point and pixel features, we use a pseudo-labeling scheme and further accumulate these features into the class-wise prototype bank with a carefully designed fusion strategy. Without bells and whistles, our approach demonstrates superior performance over the published state-of-the-arts on two large-scale benchmarks, i.e., nuScenes and SemanticKITTI, and ranks 2nd on the competitive nuScenes Lidarseg challenge leaderboard.

Related Material


[pdf] [supp]
[bibtex]
@InProceedings{Tang_2023_ICCV, author = {Tang, Pin and Xu, Hai-Ming and Ma, Chao}, title = {ProtoTransfer: Cross-Modal Prototype Transfer for Point Cloud Segmentation}, booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)}, month = {October}, year = {2023}, pages = {3337-3347} }