Transcriptomics-guided Slide Representation Learning in Computational Pathology

Jaume, Guillaume; Oldenburg, Lukas; Vaidya, Anurag; Chen, Richard J.; Williamson, Drew F.K.; Peeters, Thomas; Song, Andrew H.; Mahmood, Faisal

Guillaume Jaume, Lukas Oldenburg, Anurag Vaidya, Richard J. Chen, Drew F.K. Williamson, Thomas Peeters, Andrew H. Song, Faisal Mahmood; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 9632-9644

Abstract

Self-supervised learning (SSL) has been successful in building patch embeddings of small histology images (e.g. 224 x 224 pixels) but scaling these models to learn slide embeddings from the entirety of giga-pixel whole-slide images (WSIs) remains challenging. Here we leverage complementary information from gene expression profiles to guide slide representation learning using multi-modal pre-training. Expression profiles constitute highly detailed molecular descriptions of a tissue that we hypothesize offer a strong task-agnostic training signal for learning slide embeddings. Our slide and expression (S+E) pretraining strategy called TANGLE employs modality-specific encoders the outputs of which are aligned via contrastive learning. TANGLE was pre-trained on samples from three different organs: liver (n=6597 S+E pairs) breast (n=1020) and lung (n=1012) from two different species (Homo sapiens and Rattus norvegicus). Across three independent test datasets consisting of 1265 breast WSIs 1946 lung WSIs and 4584 liver WSIs TANGLE shows significantly better few-shot performance compared to supervised and SSL baselines. When assessed using prototype-based classification and slide retrieval TANGLE also shows a substantial performance improvement over all baselines. Code available at https://github.com/mahmoodlab/TANGLE.

Related Material

[pdf] [supp] [arXiv]

[bibtex]

@InProceedings{Jaume_2024_CVPR, author = {Jaume, Guillaume and Oldenburg, Lukas and Vaidya, Anurag and Chen, Richard J. and Williamson, Drew F.K. and Peeters, Thomas and Song, Andrew H. and Mahmood, Faisal}, title = {Transcriptomics-guided Slide Representation Learning in Computational Pathology}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2024}, pages = {9632-9644} }