Unsupervised Semantic Segmentation Through Depth-Guided Feature Correlation and Sampling

Leon Sick, Dominik Engel, Pedro Hermosilla, Timo Ropinski; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 3637-3646

Abstract


Traditionally, training neural networks to perform semantic segmentation requires expensive human-made annotations. More recently, however, advances in unsupervised learning have made significant progress on this issue and toward closing the gap to supervised algorithms. To achieve this, semantic knowledge is distilled by learning to correlate randomly sampled features from images across an entire dataset. In this work, we build upon these advances by incorporating information about the structure of the scene into the training process through the use of depth information. We achieve this by (1) learning depth-feature correlation, spatially correlating the feature maps with the depth maps to induce knowledge about the structure of the scene, and (2) exploiting farthest-point sampling to select relevant features more effectively by applying 3D sampling techniques to the depth information of the scene. Finally, we demonstrate the effectiveness of our technical contributions through extensive experimentation and present significant improvements in performance across multiple benchmark datasets.
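To make the sampling idea in (2) concrete, the following is a minimal sketch of generic greedy farthest-point sampling applied to a depth map. It is not the authors' implementation: the function names `farthest_point_sampling` and `depth_to_points` are hypothetical, and lifting pixels to 3D with normalized image coordinates (rather than camera intrinsics) is an assumption made only for illustration.

```python
import numpy as np


def farthest_point_sampling(points, num_samples, start_index=0):
    """Greedy farthest-point sampling over an (N, 3) array of points.

    Returns the indices of the selected points, in selection order.
    """
    points = np.asarray(points, dtype=np.float64)
    n = points.shape[0]
    if not 0 < num_samples <= n:
        raise ValueError("num_samples must be between 1 and N")

    selected = np.empty(num_samples, dtype=np.int64)
    selected[0] = start_index
    # Distance from every point to its nearest already-selected point.
    min_dist = np.linalg.norm(points - points[start_index], axis=1)

    for i in range(1, num_samples):
        # Pick the point that is farthest from the current selection.
        nxt = int(np.argmax(min_dist))
        selected[i] = nxt
        dist = np.linalg.norm(points - points[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist)
    return selected


def depth_to_points(depth):
    """Lift an (H, W) depth map to (H*W, 3) points.

    Assumption: x and y are normalized pixel coordinates in [0, 1];
    no camera intrinsics are used.
    """
    depth = np.asarray(depth, dtype=np.float64)
    h, w = depth.shape
    ys, xs = np.meshgrid(np.linspace(0, 1, h), np.linspace(0, 1, w), indexing="ij")
    return np.stack([xs.ravel(), ys.ravel(), depth.ravel()], axis=1)
```

Under these assumptions, the returned flat indices can be mapped back to pixel or feature-map locations with `np.unravel_index(indices, depth.shape)`, so that features are drawn from locations spread out in 3D rather than uniformly at random in the image plane.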

Related Material


[bibtex]
@InProceedings{Sick_2024_CVPR,
    author    = {Sick, Leon and Engel, Dominik and Hermosilla, Pedro and Ropinski, Timo},
    title     = {Unsupervised Semantic Segmentation Through Depth-Guided Feature Correlation and Sampling},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {3637-3646}
}