@InProceedings{Islam_2025_WACV,
    author    = {Islam, Nahid Ul and Ma, DongAo and Pang, Jiaxuan and Velan, Shivasakthi Senthil and Gotway, Michael and Liang, Jianming},
    title     = {Foundation X: Integrating Classification, Localization, and Segmentation through Lock-Release Pretraining Strategy for Chest X-ray Analysis},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {3647-3656}
}
Foundation X: Integrating Classification, Localization, and Segmentation through Lock-Release Pretraining Strategy for Chest X-ray Analysis
Abstract
Developing robust and versatile deep-learning models is essential for enhancing diagnostic accuracy and guiding clinical interventions in medical imaging, but it requires a large amount of annotated data. The advancement of deep learning has facilitated the creation of numerous medical datasets with diverse expert-level annotations. Aggregating these datasets can maximize data utilization and address the inadequacy of labeled data. However, the heterogeneity of expert-level annotations across tasks such as classification, localization, and segmentation presents a significant challenge for learning from these datasets. To this end, we introduce Foundation X, an end-to-end framework that utilizes diverse expert-level annotations from numerous public datasets to train a foundation model capable of multiple tasks, including classification, localization, and segmentation. To address the challenges of annotation and task heterogeneity, we propose a Lock-Release pretraining strategy to enhance the cyclic learning from multiple datasets, combined with the student-teacher learning paradigm, ensuring the model retains general knowledge for all tasks while preventing overfitting to any single task. To demonstrate the effectiveness of Foundation X, we trained a model using 11 chest X-ray datasets, covering annotations for classification, localization, and segmentation tasks. Our experimental results show that Foundation X achieves notable performance gains through extensive annotation utilization, excels in cross-dataset and cross-task learning, and further enhances performance in organ localization and segmentation tasks. All code and pretrained models are publicly accessible at GitHub.com/JLiangLab/Foundation_X.