@InProceedings{Prakash_2025_ICCV,
    author    = {Prakash, Eva and Valanarasu, Jeya Maria Jose and Chen, Zhihong and Reis, Eduardo Pontes and Johnston, Andrew and Pareek, Anuj and Bluethgen, Christian and Gatidis, Sergios and Olsen, Cameron and Chaudhari, Akshay S and Ng, Andrew Y. and Langlotz, Curtis},
    title     = {Evaluating and Improving the Effectiveness of Synthetic Chest X-Rays for Medical Image Analysis},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
    month     = {October},
    year      = {2025},
    pages     = {4472-4480}
}
Evaluating and Improving the Effectiveness of Synthetic Chest X-Rays for Medical Image Analysis
Abstract
In this work, we explore best-practice approaches for generating synthetic chest X-ray images and augmenting medical imaging datasets to optimize the performance of deep learning models in downstream tasks like classification and segmentation. We utilize a latent diffusion model to condition the generation of synthetic chest X-rays on text prompts and/or segmentation masks. We explore methods such as using a proxy model and incorporating radiologist feedback to improve the quality of synthetic data. These synthetic images are generated from relevant disease information or geometrically-transformed segmentation masks and added to ground truth training set images from the CheXpert, CANDID-PTX, SIIM, and RSNA Pneumonia datasets in order to measure improvements in classification and segmentation model performance on the test sets. F1 and Dice scores are used to evaluate classification and segmentation, respectively. Across all experiments, the synthetic data we generate results in a maximum mean classification F1 score improvement of 0.15 (CI: 0.10, 0.20; P=0.0031) compared to using only real data. For segmentation, the maximum Dice score improvement is 0.14 (CI: 0.11, 0.18; P=0.0064). We find that best practices for generating synthetic chest X-ray images for downstream tasks include conditioning on single-disease labels or geometrically-transformed segmentation masks, as well as potentially using proxy modeling to fine-tune such generations.
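For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how a binary F1 score and a Dice score are commonly computed. It is not the authors' evaluation code: the function names `f1_score` and `dice_score` are illustrative, the abstract does not state how scores are averaged across classes or images, and returning 1.0 for two empty masks is an assumed convention.

```python
from itertools import chain


def f1_score(y_true, y_pred):
    """Binary F1 = 2*TP / (2*TP + FP + FN) over paired 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    denom = 2 * tp + fp + fn
    # Assumption: with no positives in either labels or predictions, report 0.0.
    return 2 * tp / denom if denom else 0.0


def dice_score(mask_true, mask_pred):
    """Dice = 2*|A ∩ B| / (|A| + |B|) over two 2D binary masks (lists of rows)."""
    flat_true = list(chain.from_iterable(mask_true))
    flat_pred = list(chain.from_iterable(mask_pred))
    intersection = sum(1 for t, p in zip(flat_true, flat_pred) if t and p)
    total = sum(1 for t in flat_true if t) + sum(1 for p in flat_pred if p)
    # Assumption: two empty masks count as a perfect match.
    return 2 * intersection / total if total else 1.0


if __name__ == "__main__":
    # Toy inputs, not data from the paper.
    print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
    print(dice_score([[1, 1], [0, 0]], [[1, 0], [0, 0]]))
```

For binary inputs the two formulas coincide; the convention is to report F1 over image-level labels and Dice over pixel-level masks, which matches how the abstract pairs them with classification and segmentation.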
