Towards Universal Dataset Distillation via Task-Driven Diffusion
Abstract
Dataset distillation (DD) condenses key information from large-scale datasets into smaller synthetic datasets, reducing the storage and computational costs of training networks. However, recent research has focused primarily on image classification, with limited extension to detection and segmentation. Two key challenges remain: (i) Task Optimization Heterogeneity, where existing methods focus on class-level information and fail to address the diverse needs of detection and segmentation, and (ii) Inflexible Image Generation, where current generation methods rely on global updates toward single-class targets and lack localized optimization for specific object regions. To address these challenges, we propose a universal dataset distillation framework, named UniDD, a task-driven diffusion model for diverse DD tasks, as illustrated in Fig. 1. Our approach operates in two stages: Universal Task Knowledge Mining, which captures task-relevant information by training task-specific proxy models, and Universal Task-Driven Diffusion, where these proxies guide the diffusion process to generate task-specific synthetic images. Extensive experiments on ImageNet-1K, Pascal VOC, and MS COCO demonstrate that UniDD consistently outperforms state-of-the-art methods. In particular, on ImageNet-1K with IPC-10, UniDD surpasses previous diffusion-based methods by 6.1% while also reducing deployment costs.
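The two-stage recipe described in the abstract can be pictured with a short sketch. The PyTorch snippet below is a minimal, assumption-laden illustration, not the authors' implementation: the names (`TaskProxy`, `guided_denoise_step`, `guide_scale`) are hypothetical, a tiny classifier stands in for the paper's task-specific proxy models, and an identity function stands in for a real diffusion denoiser. It shows only the general pattern of mining task knowledge into a proxy (stage 1) and using that proxy's task loss to steer diffusion sampling (stage 2).

```python
# Hypothetical sketch of a two-stage, task-driven distillation loop.
# All names and the guidance rule are illustrative assumptions,
# not the paper's actual API or formulation.

import torch
import torch.nn as nn
import torch.nn.functional as F


# --- Stage 1: Universal Task Knowledge Mining -------------------------
# Train a lightweight proxy on the real dataset so its task loss
# (classification here; a detection or segmentation loss would slot
# in the same way) encodes the task-relevant information to distill.
class TaskProxy(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, num_classes),
        )

    def forward(self, x):
        return self.net(x)


def task_loss(proxy, images, labels):
    return F.cross_entropy(proxy(images), labels)


# --- Stage 2: Universal Task-Driven Diffusion -------------------------
# At each reverse-diffusion step, nudge the current sample toward
# lower proxy task loss (classifier-guidance style), so the update
# can act on task-specific image content rather than only a global
# single-class target.
@torch.no_grad()
def guided_denoise_step(x_t, denoiser, proxy, labels, guide_scale=1.0):
    x_hat = denoiser(x_t)  # stand-in for one denoising step
    with torch.enable_grad():
        x = x_hat.detach().requires_grad_(True)
        grad = torch.autograd.grad(task_loss(proxy, x, labels), x)[0]
    return x_hat - guide_scale * grad  # task-driven correction


if __name__ == "__main__":
    proxy = TaskProxy()
    # Stage 1 on (dummy) real data.
    opt = torch.optim.Adam(proxy.parameters(), lr=1e-3)
    real_x, real_y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
    for _ in range(5):
        opt.zero_grad()
        task_loss(proxy, real_x, real_y).backward()
        opt.step()
    # Stage 2: one guided step with an identity "denoiser" placeholder.
    x_t = torch.randn(4, 3, 32, 32)
    x_next = guided_denoise_step(x_t, lambda x: x, proxy,
                                 torch.randint(0, 10, (4,)))
    print(x_next.shape)
```

Under this reading, swapping the cross-entropy term for a detection or segmentation loss is what would let the same guided-generation loop serve other tasks, which appears to be the sense in which the framework is "universal".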
Related Material
[pdf]
[supp]
[bibtex]
@InProceedings{Qi_2025_CVPR,
    author    = {Qi, Ding and Li, Jian and Gao, Junyao and Dou, Shuguang and Tai, Ying and Hu, Jianlong and Zhao, Bo and Wang, Yabiao and Wang, Chengjie and Zhao, Cairong},
    title     = {Towards Universal Dataset Distillation via Task-Driven Diffusion},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {10557-10566}
}