@InProceedings{Saito_2025_WACV,
  author    = {Saito, Ayumu and Kudeshia, Prachi and Poovvancheri, Jiju},
  title     = {Point-JEPA: A Joint Embedding Predictive Architecture for Self-Supervised Learning on Point Cloud},
  booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
  month     = {February},
  year      = {2025},
  pages     = {7348-7357}
}
Point-JEPA: A Joint Embedding Predictive Architecture for Self-Supervised Learning on Point Cloud
Abstract
Recent advancements in self-supervised learning in the point cloud domain have demonstrated significant potential. However, these methods often suffer from drawbacks such as lengthy pre-training time, the necessity of reconstruction in the input space, and the reliance on additional modalities. To address these issues, we introduce Point-JEPA, a joint embedding predictive architecture designed specifically for point cloud data. To this end, we introduce a sequencer that orders point cloud patch embeddings so that their proximity can be efficiently computed and utilized, based on their indices, during target and context selection. The sequencer also allows the proximity computations for the patch embeddings to be shared between context and target selection, further improving efficiency. Experimentally, our method demonstrates state-of-the-art performance while avoiding reconstruction in the input space and additional modalities. In particular, Point-JEPA attains a classification accuracy of 93.7% with a linear SVM on ModelNet40, surpassing all other self-supervised models. Moreover, Point-JEPA establishes new state-of-the-art results across all four few-shot learning evaluation settings. The code is available at https://github.com/Ayumu-JS/Point-JEPA
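The abstract does not specify how the sequencer orders patch embeddings; as one illustrative (hypothetical) realization of the idea, a greedy nearest-neighbor chain over patch center coordinates yields an ordering in which consecutive indices correspond to spatially nearby patches, so contiguous index ranges can approximate spatial neighborhoods for context and target selection. The function name and details below are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def sequence_patches(centers: np.ndarray) -> np.ndarray:
    """Hypothetical sequencer sketch: greedy nearest-neighbor ordering
    of patch centers with shape (N, 3).

    Returns an index permutation such that consecutive indices refer to
    spatially close patches, so index proximity stands in for spatial
    proximity when selecting contiguous context/target blocks.
    """
    n = len(centers)
    visited = np.zeros(n, dtype=bool)
    order = [0]                      # start from an arbitrary patch
    visited[0] = True
    for _ in range(n - 1):
        last = centers[order[-1]]
        # distances from the most recently placed patch to all patches
        dists = np.linalg.norm(centers - last, axis=1)
        dists[visited] = np.inf      # exclude already-ordered patches
        nxt = int(np.argmin(dists))
        order.append(nxt)
        visited[nxt] = True
    return np.array(order)
```

Computing this ordering once per point cloud lets both the context and target selection reuse the same proximity structure, which is the kind of shared computation the abstract alludes to.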