Representation Recovering for Self-Supervised Pre-Training on Medical Images

Yan, Xiangyi; Naushad, Junayed; Sun, Shanlin; Han, Kun; Tang, Hao; Kong, Deying; Ma, Haoyu; You, Chenyu; Xie, Xiaohui

Xiangyi Yan, Junayed Naushad, Shanlin Sun, Kun Han, Hao Tang, Deying Kong, Haoyu Ma, Chenyu You, Xiaohui Xie; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023, pp. 2685-2695

Abstract

Advances in self-supervised learning, especially in contrastive learning, have drawn attention to these techniques as a way to learn effective visual representations from unlabeled images. Contrastive learning enables models to extract highly consistent features by generating different views of the same input. Following the recent success of Masked Autoencoders (MAE), generative modeling in self-supervised learning has come back into the community's focus. Generative approaches encode the input into a compact embedding and train the model to recover the original input. However, in our experiments, we found that vanilla MAE mainly recovers coarse high-level semantic information and barely recovers detailed low-level information. We show that directly applying MAE is not ideal for dense downstream prediction tasks such as multi-organ segmentation. In this paper, we propose RepRec, a hybrid visual representation learning framework for self-supervised pre-training on large-scale unlabeled medical datasets, which takes advantage of both contrastive and generative modeling. To address this limitation of MAE, a convolutional encoder is pre-trained in a contrastive way to provide low-level feature information, and a transformer encoder is pre-trained in a generative way, by recovering masked representations from the convolutional encoder, to produce high-level semantic dependencies. Extensive experiments on three multi-organ segmentation datasets demonstrate that our method outperforms current state-of-the-art methods.
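The two training signals named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the loss functions, masking ratio, or architectures, so everything below is an assumption chosen for illustration: a generic InfoNCE loss stands in for the contrastive objective, and a mean squared error over masked feature tokens stands in for representation recovery. The names `info_nce`, `random_mask`, and `masked_recovery_loss` are hypothetical and are not taken from the paper.

```python
import math
import random


def info_nce(view_a, view_b, temperature=0.1):
    """Generic InfoNCE loss: row i of view_a should match row i of view_b.

    Assumption: the abstract only says "contrastive"; InfoNCE is one common choice.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    a = [normalize(v) for v in view_a]
    b = [normalize(v) for v in view_b]
    total = 0.0
    for i, anchor in enumerate(a):
        # Cosine similarity of the anchor to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(anchor, cand)) / temperature for cand in b]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denom - logits[i]
    return total / len(a)


def random_mask(num_tokens, mask_ratio, rng):
    """Return a sorted list of token indices to hide (ratio is an assumed parameter)."""
    num_masked = max(1, int(round(num_tokens * mask_ratio)))
    return sorted(rng.sample(range(num_tokens), num_masked))


def masked_recovery_loss(target_tokens, predicted_tokens, masked_idx):
    """Mean squared error computed over the masked feature tokens only.

    target_tokens plays the role of the convolutional encoder's features;
    predicted_tokens plays the role of the transformer's reconstruction.
    """
    errors = [
        (p - t) ** 2
        for i in masked_idx
        for p, t in zip(predicted_tokens[i], target_tokens[i])
    ]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy stand-in for a sequence of 8 feature tokens with 4 channels each.
    features = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(8)]
    masked = random_mask(len(features), mask_ratio=0.5, rng=rng)
    # A perfect reconstruction gives zero recovery loss on the masked tokens.
    print(masked_recovery_loss(features, features, masked))
```

In this sketch the recovery target is a feature token rather than a raw pixel patch, which mirrors the abstract's description of recovering masked representations instead of the original input. How the two objectives are scheduled or weighted is not stated in the abstract and is left out here.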

Related Material

@InProceedings{Yan_2023_WACV,
    author    = {Yan, Xiangyi and Naushad, Junayed and Sun, Shanlin and Han, Kun and Tang, Hao and Kong, Deying and Ma, Haoyu and You, Chenyu and Xie, Xiaohui},
    title     = {Representation Recovering for Self-Supervised Pre-Training on Medical Images},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2023},
    pages     = {2685-2695}
}