SoREL: Soft-Label Refurbishment with Ensemble Learning for Noisy Long-Tailed Classification

Hsieh, Jun Wei; Wu, Ying-Hsuan; Hsieh, Yi-Kuan; Li, Xin; Peng, Kuan-Chuan; Chang, Ming-Ching

Jun Wei Hsieh, Ying-Hsuan Wu, Yi-Kuan Hsieh, Xin Li, Kuan-Chuan Peng, Ming-Ching Chang; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Findings, 2026, pp. 6839-6848

Abstract

Real-world datasets often suffer from both noisy labels and long-tailed distributions, where rare classes are more prone to annotation errors. Existing methods typically address these issues separately or rely on unreliable noise pre-screening, leading to biased learning and unstable optimization. We propose Soft-label Refurbishment with Ensemble Learning (SoREL), a two-stage framework that jointly handles label noise and class imbalance. In the first stage, SoREL performs robust soft-label refurbishment via contrastive learning for unbiased representation learning and a Balanced Noise-tolerant Cross-entropy (BANC) loss for stable pre-screening. In the second stage, refurbished soft labels guide multi-expert ensemble learning, where experts specialize in many-, medium-, and few-shot classes. Soft-label-based class statistics further refine loss weighting to better match the true data distribution. Experiments on simulated and real-world noisy long-tailed datasets demonstrate that SoREL achieves 91.80%/67.59% on CIFAR-10/100-LT and 77.74%/81.40% on Food-101N and Animal-10N, significantly outperforming prior methods.
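
The abstract does not spell out how soft-label-based class statistics are computed or how they enter the loss. As a rough illustration of the general idea only, the sketch below sums soft-label probability mass per class to get "soft" class counts and then derives inverse-frequency weights from them. The function names, the toy data, and the inverse-frequency rule are all assumptions made for illustration and are not taken from the paper.

```python
from typing import List


def soft_class_counts(soft_labels: List[List[float]]) -> List[float]:
    """Sum each class's probability mass over all samples.

    Hypothetical helper: with one-hot labels this reduces to ordinary
    per-class sample counts.
    """
    num_classes = len(soft_labels[0])
    counts = [0.0] * num_classes
    for row in soft_labels:
        for c, p in enumerate(row):
            counts[c] += p
    return counts


def inverse_frequency_weights(counts: List[float], eps: float = 1e-8) -> List[float]:
    """Per-class weights proportional to 1 / count, normalized to mean 1.

    Assumption: generic inverse-frequency reweighting stands in for
    whatever weighting rule the paper actually uses.
    """
    inv = [1.0 / (c + eps) for c in counts]
    mean_inv = sum(inv) / len(inv)
    return [w / mean_inv for w in inv]


# Toy soft labels for 5 samples over 3 classes (each row sums to 1).
soft = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]

counts = soft_class_counts(soft)          # head class gets the most mass
weights = inverse_frequency_weights(counts)  # tail class gets the largest weight
print(counts, weights)
```

In this toy case the rarest class receives the largest weight, which is the qualitative behavior one would expect from statistics meant to counter a long-tailed distribution.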

Related Material

[bibtex]

@InProceedings{Hsieh_2026_CVPR,
    author    = {Hsieh, Jun Wei and Wu, Ying-Hsuan and Hsieh, Yi-Kuan and Li, Xin and Peng, Kuan-Chuan and Chang, Ming-Ching},
    title     = {SoREL: Soft-Label Refurbishment with Ensemble Learning for Noisy Long-Tailed Classification},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Findings},
    month     = {June},
    year      = {2026},
    pages     = {6839-6848}
}