Latent-based Diffusion Model for Long-tailed Recognition

Pengxiao Han, Changkun Ye, Jieming Zhou, Jing Zhang, Jie Hong, Xuesong Li
Abstract
The long-tailed imbalance distribution is a common issue in practical computer vision applications. Previous works have proposed methods to address this problem, which can be categorized into several classes: re-sampling, re-weighting, transfer learning, and feature augmentation. In recent years, diffusion models have shown impressive generation ability in many sub-problems of deep computer vision, but this powerful generation ability has not yet been explored in long-tailed problems. We propose a new approach, the Latent-based Diffusion Model for Long-tailed Recognition (LDMLR), as a feature augmentation method to tackle the issue. First, we encode the imbalanced dataset into features using the baseline model. Then, we train a Denoising Diffusion Implicit Model (DDIM) on these encoded features to generate pseudo-features. Finally, we train the classifier on both the encoded features and the pseudo-features from the previous two steps. The proposed method improves classification accuracy on the CIFAR-LT and ImageNet-LT datasets.
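The abstract outlines a three-stage pipeline: encode the imbalanced data into features, train a DDIM on those features to synthesize pseudo-features, then train a classifier on the real and generated features. The following PyTorch sketch illustrates one way such a pipeline could be wired together under stated assumptions; the feature dimension, network sizes, noise schedule, iteration counts, and the stand-in encoder (a frozen linear projection in place of the paper's baseline model) are illustrative placeholders, and names such as CondDenoiser and ddim_sample are hypothetical, not the authors' implementation.

# Hypothetical sketch of the three-stage LDMLR-style pipeline described in the
# abstract, assuming PyTorch. All sizes and hyper-parameters are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

FEAT_DIM, NUM_CLASSES, T = 128, 10, 1000

# Diffusion schedule (linear betas, as in DDPM/DDIM).
betas = torch.linspace(1e-4, 0.02, T)
alpha_bar = torch.cumprod(1.0 - betas, dim=0)


class CondDenoiser(nn.Module):
    """Class-conditional noise predictor operating on feature vectors."""

    def __init__(self):
        super().__init__()
        self.t_embed = nn.Embedding(T, 64)
        self.y_embed = nn.Embedding(NUM_CLASSES, 64)
        self.net = nn.Sequential(
            nn.Linear(FEAT_DIM + 128, 256), nn.SiLU(),
            nn.Linear(256, 256), nn.SiLU(),
            nn.Linear(256, FEAT_DIM),
        )

    def forward(self, x_t, t, y):
        h = torch.cat([x_t, self.t_embed(t), self.y_embed(y)], dim=-1)
        return self.net(h)


def diffusion_loss(model, x0, y):
    """Standard noise-prediction objective on encoded features."""
    t = torch.randint(0, T, (x0.size(0),))
    eps = torch.randn_like(x0)
    ab = alpha_bar[t].unsqueeze(-1)
    x_t = ab.sqrt() * x0 + (1 - ab).sqrt() * eps
    return F.mse_loss(model(x_t, t, y), eps)


@torch.no_grad()
def ddim_sample(model, y, steps=50):
    """Deterministic DDIM sampling (eta = 0) of pseudo-features for labels y."""
    ts = torch.linspace(T - 1, 0, steps).long()
    x = torch.randn(y.size(0), FEAT_DIM)
    for i, t in enumerate(ts):
        t_batch = torch.full((y.size(0),), t.item(), dtype=torch.long)
        eps = model(x, t_batch, y)
        ab_t = alpha_bar[t]
        x0_pred = (x - (1 - ab_t).sqrt() * eps) / ab_t.sqrt()
        ab_prev = alpha_bar[ts[i + 1]] if i + 1 < len(ts) else torch.tensor(1.0)
        x = ab_prev.sqrt() * x0_pred + (1 - ab_prev).sqrt() * eps
    return x


# Stage 1 (assumed): a pretrained baseline encoder maps images to features.
# A frozen random projection stands in for it here.
encoder = nn.Linear(3 * 32 * 32, FEAT_DIM).requires_grad_(False)
images = torch.randn(256, 3 * 32 * 32)            # toy long-tailed batch
labels = torch.randint(0, NUM_CLASSES, (256,))
feats = encoder(images)

# Stage 2: train the conditional diffusion model on the encoded features.
denoiser = CondDenoiser()
opt = torch.optim.Adam(denoiser.parameters(), lr=1e-3)
for _ in range(100):                               # illustrative iteration count
    opt.zero_grad()
    diffusion_loss(denoiser, feats, labels).backward()
    opt.step()

# Generate pseudo-features, e.g. for under-represented (tail) classes.
tail_labels = torch.randint(NUM_CLASSES // 2, NUM_CLASSES, (128,))
pseudo_feats = ddim_sample(denoiser, tail_labels)

# Stage 3: train the classifier on real + pseudo features.
classifier = nn.Linear(FEAT_DIM, NUM_CLASSES)
clf_opt = torch.optim.Adam(classifier.parameters(), lr=1e-3)
all_feats = torch.cat([feats, pseudo_feats])
all_labels = torch.cat([labels, tail_labels])
for _ in range(100):
    clf_opt.zero_grad()
    F.cross_entropy(classifier(all_feats), all_labels).backward()
    clf_opt.step()

As the abstract frames it, the appeal of augmenting at the latent level is that the diffusion model only has to operate on compact feature vectors rather than full images, so the pseudo-samples for tail classes can be generated with a comparatively small model.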
Related Material

@InProceedings{Han_2024_CVPR,
  author    = {Han, Pengxiao and Ye, Changkun and Zhou, Jieming and Zhang, Jing and Hong, Jie and Li, Xuesong},
  title     = {Latent-based Diffusion Model for Long-tailed Recognition},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
  month     = {June},
  year      = {2024},
  pages     = {2639-2648}
}