Codebook VQ-VAE Approach for Prostate Cancer Diagnosis using Multiparametric MRI

Ekaterina Redekop, Mara Pleasure, Zichen Wang, Karthik V Sarma, Adam Kinnaird, William Speier, Corey W Arnold; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 2365-2372

Abstract


Multiparametric magnetic resonance imaging (mpMRI) plays an essential role in prostate cancer diagnosis, as it can noninvasively localize and grade lesions based on their suspicion of representing clinically significant prostate cancer (csPCa). With the development of deep learning, automatic solutions for csPCa detection based on mpMRI have been proposed; however, mpMRI data introduces several difficulties, including data scarcity, heterogeneity in image quality across institutions, and missing modalities. This work addresses these difficulties by building a radiology-based foundation model for prostate cancer mpMRI. Foundation models are deep learning models pre-trained on a large-scale dataset, and they have recently gained significant interest in computer vision and natural language applications. After pretraining, these models are often adapted for a variety of downstream tasks using smaller datasets from within the same domain. In this work, a large prostate mpMRI dataset was collected by combining data from our institution with two publicly available datasets. Joint modeling of all mpMRI modalities is essential for accurate prostate cancer diagnosis; however, some of these modalities may be missing. Using unsupervised learning, we pretrained modality-specific vector quantized variational autoencoders (VQ-VAE) to form a radiology foundation model. The learned codebook from the VQ-VAE was then used to train a multimodal transformer to perform csPCa diagnosis. The proposed multimodal transformer models long-range dependencies between latent representations of input modalities and is augmented with modality-level dropout to increase the model's robustness to incomplete modalities. Our framework outperforms previously published work and achieves an average AUC/sensitivity/specificity of 0.764/0.690/0.781. Our results show that pretraining on a larger dataset, in combination with the transformer architecture, can improve the accuracy of automatic prostate cancer detection.
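To make the two mechanisms named in the abstract concrete, below is a minimal PyTorch sketch (not the authors' released code) of (1) the VQ-VAE vector-quantization step that maps continuous encoder features to discrete codebook tokens via nearest-neighbor lookup with a straight-through gradient, and (2) a modality-level dropout helper that randomly removes entire mpMRI modalities during training. All names, shapes, and hyperparameters (num_codes, code_dim, beta, p) are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VectorQuantizer(nn.Module):
    """Standard VQ-VAE quantization: nearest codebook entry plus
    straight-through estimator (hyperparameters are assumptions)."""
    def __init__(self, num_codes=512, code_dim=64, beta=0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, code_dim)
        self.codebook.weight.data.uniform_(-1.0 / num_codes, 1.0 / num_codes)
        self.beta = beta  # commitment-loss weight

    def forward(self, z_e):
        # z_e: (batch, ..., code_dim) continuous encoder output
        flat = z_e.reshape(-1, z_e.shape[-1])
        # squared distance from each feature vector to every codebook entry
        d = (flat.pow(2).sum(1, keepdim=True)
             - 2 * flat @ self.codebook.weight.t()
             + self.codebook.weight.pow(2).sum(1))
        idx = d.argmin(dim=1)                  # discrete token ids
        z_q = self.codebook(idx).view_as(z_e)  # quantized vectors
        # codebook loss + commitment loss
        loss = F.mse_loss(z_q, z_e.detach()) + self.beta * F.mse_loss(z_e, z_q.detach())
        z_q = z_e + (z_q - z_e).detach()       # straight-through gradient
        return z_q, idx.view(z_e.shape[:-1]), loss

def modality_dropout(modality_tokens, p=0.2, training=True):
    """Hypothetical helper: drop whole modalities at random during training
    so the downstream transformer learns to cope with missing sequences."""
    if not training:
        return modality_tokens
    kept = [tokens for tokens in modality_tokens if torch.rand(()) > p]
    return kept if kept else modality_tokens[:1]  # always keep one modality
```

In this sketch, each pretrained modality-specific VQ-VAE would emit a grid of token ids via `VectorQuantizer`; the surviving modalities' token embeddings are then concatenated and fed to the multimodal transformer, so at inference time a missing modality simply contributes no tokens.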

Related Material


[pdf]
[bibtex]
@InProceedings{Redekop_2024_CVPR,
    author    = {Redekop, Ekaterina and Pleasure, Mara and Wang, Zichen and Sarma, Karthik V and Kinnaird, Adam and Speier, William and Arnold, Corey W},
    title     = {Codebook VQ-VAE Approach for Prostate Cancer Diagnosis using Multiparametric MRI},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {2365-2372}
}