Robust Dataset Condensation using Supervised Contrastive Learning

Nicole Hee-Yeon Kim, Hwanjun Song; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2025, pp. 2857-2866

Abstract


Dataset condensation aims to compress a large dataset into a smaller synthetic set while preserving the essential representations needed for effective model training. However, existing methods show severe performance degradation when applied to noisy datasets. To address this, we present robust dataset condensation (RDC), an end-to-end method that mitigates noise to generate a clean and robust synthetic set, without requiring separate noise-reduction preprocessing steps. RDC refines the condensation process by integrating contrastive learning tailored for robust condensation, named golden MixUp contrast. It uses synthetic samples to sharpen class boundaries and to mitigate noisy representations, while its augmentation strategy compensates for the limited size of the synthetic set by identifying clean samples from noisy training data, enriching synthetic images with real-data diversity. We evaluate RDC against existing condensation methods and a conventional approach that first applies noise-cleaning algorithms to the dataset before performing condensation. Extensive experiments show that RDC outperforms other approaches on CIFAR-10/100 across different types of noise, including asymmetric, symmetric, and real-world noise. Code is available at https://github.com/DISL-Lab/RDC-ICCV2025.
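To make the two ingredients concrete, the following is a minimal illustrative sketch (not the authors' released implementation; see the repository for that) of a standard supervised contrastive loss over normalized embeddings, plus a MixUp-style blend of a clean real image with a synthetic image, which is the general shape of the augmentation the abstract describes. All function names and parameters here are hypothetical.

```python
import numpy as np

def supcon_loss(features, labels, temperature=0.1):
    """Supervised contrastive loss (Khosla et al., 2020 style) over
    L2-normalized feature vectors; anchors pull same-class samples together."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T / temperature                      # pairwise cosine similarities
    n = len(labels)
    not_self = ~np.eye(n, dtype=bool)
    logits = sim - sim.max(axis=1, keepdims=True)    # numerical stability
    exp = np.exp(logits) * not_self                  # exclude self-similarity
    log_prob = logits - np.log(exp.sum(axis=1, keepdims=True))
    pos = (labels[:, None] == labels[None, :]) & not_self
    valid = pos.sum(axis=1) > 0                      # anchors with >=1 positive
    loss = -(log_prob * pos).sum(axis=1)[valid] / pos.sum(axis=1)[valid]
    return loss.mean()

def mixup(x_clean, x_synth, alpha=0.8):
    """MixUp-style convex blend of a clean real sample with a synthetic
    sample (illustrative stand-in for the paper's 'golden MixUp')."""
    lam = np.random.beta(alpha, alpha)
    return lam * x_clean + (1 - lam) * x_synth
```

In this sketch, well-separated same-class embeddings drive the contrastive loss toward zero, while the blend keeps the small synthetic set exposed to real-data variation; the paper's contribution is selecting which noisy-training samples count as clean enough to mix.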

Related Material


[pdf] [supp]
[bibtex]
@InProceedings{Kim_2025_ICCV,
    author    = {Kim, Nicole Hee-Yeon and Song, Hwanjun},
    title     = {Robust Dataset Condensation using Supervised Contrastive Learning},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2025},
    pages     = {2857-2866}
}