@InProceedings{Liu_2026_CVPR,
    author    = {Liu, Han and Georgescu, Bogdan and Zhang, Yanbo and Yoo, Youngjin and Baumgartner, Michael and Gao, Riqiang and Wang, Jianing and Zhao, Gengyan and Gibson, Eli and Comaniciu, Dorin and Grbic, Sasa},
    title     = {Revisiting 2D Foundation Models for Scalable 3D Medical Image Classification},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
    pages     = {30021-30031}
}
Revisiting 2D Foundation Models for Scalable 3D Medical Image Classification
Abstract
3D medical image classification is essential to modern clinical workflows. Medical foundation models (FMs) have emerged as a promising approach for scaling to new tasks, yet current research suffers from three critical pitfalls: data-regime bias, suboptimal adaptation, and insufficient task coverage. In this paper, we address these pitfalls and introduce AnyMC3D, a scalable 3D classifier adapted from 2D FMs. It scales efficiently to new tasks by adding only lightweight plugins (~1M parameters per task) to a single frozen backbone. The framework also supports multi-view inputs, auxiliary pixel-level supervision, and interpretable heatmap generation. We establish a comprehensive benchmark of 12 tasks covering diverse pathologies, anatomies, and modalities, and systematically evaluate state-of-the-art 3D classification techniques. Our analysis reveals several key insights: (1) effective adaptation is critical to unlocking FM potential, (2) general-purpose FMs can match medical-specific FMs if properly adapted, and (3) 2D-based methods surpass 3D architectures for 3D classification. For the first time, we demonstrate that a single scalable framework can achieve state-of-the-art performance across diverse applications (e.g., 1st place in the *** challenge), eliminating the need for separate task-specific 3D models.