DiffuseMix: Label-Preserving Data Augmentation with Diffusion Models

Khawar Islam, Muhammad Zaigham Zaheer, Arif Mahmood, Karthik Nandakumar; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 27621-27630

Abstract


Recently a number of image-mixing-based augmentation techniques have been introduced to improve the generalization of deep neural networks. In these techniques two or more randomly selected natural images are mixed together to generate an augmented image. Such methods may not only omit important portions of the input images but also introduce label ambiguities by mixing images across labels resulting in misleading supervisory signals. To address these limitations we propose DIFFUSEMIX a novel data augmentation technique that leverages a diffusion model to reshape training images supervised by our bespoke conditional prompts. First concatenation of a partial natural image and its generated counterpart is obtained which helps in avoiding the generation of unrealistic images or label ambiguities. Then to enhance resilience against adversarial attacks and improves safety measures a randomly selected structural pattern from a set of fractal images is blended into the concatenated image to form the final augmented image for training. Our empirical results on seven different datasets reveal that DIFFUSEMIX achieves superior performance compared to existing state- of-the-art methods on tasks including general classification fine-grained classification fine-tuning data scarcity and adversarial robustness.

Related Material


[pdf] [supp] [arXiv]
[bibtex]
@InProceedings{Islam_2024_CVPR, author = {Islam, Khawar and Zaheer, Muhammad Zaigham and Mahmood, Arif and Nandakumar, Karthik}, title = {DiffuseMix: Label-Preserving Data Augmentation with Diffusion Models}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2024}, pages = {27621-27630} }