Foundation X: Integrating Classification, Localization, and Segmentation through Lock-Release Pretraining Strategy for Chest X-ray Analysis

Nahid Ul Islam, DongAo Ma, Jiaxuan Pang, Shivasakthi Senthil Velan, Michael Gotway, Jianming Liang; Proceedings of the Winter Conference on Applications of Computer Vision (WACV), 2025, pp. 3647-3656

Abstract


Developing robust and versatile deep-learning models is essential for enhancing diagnostic accuracy and guiding clinical interventions in medical imaging, but it requires a large amount of annotated data. The advancement of deep learning has facilitated the creation of numerous medical datasets with diverse expert-level annotations. Aggregating these datasets can maximize data utilization and address the inadequacy of labeled data. However, the heterogeneity of expert-level annotations across tasks such as classification, localization, and segmentation presents a significant challenge for learning from these datasets. To this end, we introduce Foundation X, an end-to-end framework that utilizes diverse expert-level annotations from numerous public datasets to train a foundation model capable of multiple tasks, including classification, localization, and segmentation. To address the challenges of annotation and task heterogeneity, we propose a Lock-Release pretraining strategy to enhance the cyclic learning from multiple datasets, combined with the student-teacher learning paradigm, ensuring the model retains general knowledge for all tasks while preventing overfitting to any single task. To demonstrate the effectiveness of Foundation X, we trained a model using 11 chest X-ray datasets, covering annotations for classification, localization, and segmentation tasks. Our experimental results show that Foundation X achieves notable performance gains through extensive annotation utilization, excels in cross-dataset and cross-task learning, and further enhances performance in organ localization and segmentation tasks. All code and pretrained models are publicly accessible at GitHub.com/JLiangLab/Foundation_X.

Related Material


[bibtex]
@InProceedings{Islam_2025_WACV,
    author    = {Islam, Nahid Ul and Ma, DongAo and Pang, Jiaxuan and Velan, Shivasakthi Senthil and Gotway, Michael and Liang, Jianming},
    title     = {Foundation X: Integrating Classification, Localization, and Segmentation through Lock-Release Pretraining Strategy for Chest X-ray Analysis},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {3647-3656}
}