Vim4Path: Self-Supervised Vision Mamba for Histopathology Images

Nasiri-Sarvi, Ali; Trinh, Vincent Quoc-Huy; Rivaz, Hassan; Hosseini, Mahdi S.

Ali Nasiri-Sarvi, Vincent Quoc-Huy Trinh, Hassan Rivaz, Mahdi S. Hosseini; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 6894-6903

Abstract

Representation learning from Gigapixel Whole Slide Images (WSI) poses a significant challenge in computational pathology due to the complicated nature of tissue structures and the scarcity of labeled data. Multi-instance learning methods have addressed this challenge leveraging image patches to classify slides utilizing pretrained models using Self-Supervised Learning (SSL) approaches. The performance of both SSL and MIL methods relies on the architecture of the feature encoder. This paper proposes leveraging the Vision Mamba (Vim) architecture inspired by state space models within the DINO framework for representation learning in computational pathology. We evaluate the performance of Vim against Vision Transformers (ViT) on the Camelyon16 dataset for both patch-level and slide-level classification. Our findings highlight Vim's enhanced performance compared to ViT particularly at smaller scales where Vim achieves an 8.21 increase in ROC AUC for models of similar size. An explainability analysis further highlights Vim's capabilities which reveals that Vim uniquely emulates the pathologist workflow--unlike ViT. This alignment with human expert analysis highlights Vim's potential in practical diagnostic settings and contributes significantly to developing effective representation-learning algorithms in computational pathology. We release the codes and pretrained weights at \color purple https://github.com/AtlasAnalyticsLab/Vim4Path.

Related Material

[pdf] [arXiv]

[bibtex]

@InProceedings{Nasiri-Sarvi_2024_CVPR, author = {Nasiri-Sarvi, Ali and Trinh, Vincent Quoc-Huy and Rivaz, Hassan and Hosseini, Mahdi S.}, title = {Vim4Path: Self-Supervised Vision Mamba for Histopathology Images}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops}, month = {June}, year = {2024}, pages = {6894-6903} }