DAT: Training Deep Networks Robust To Label-Noise by Matching the Feature Distributions
In real-world applications, the performance of deep networks can degrade when the training dataset contains noisy labels. Existing methods for learning with noisy labels are limited in two respects. First, methods based on noise probability modeling can only be applied to class-level noisy labels. Second, methods based on the memorization effect perform well on synthetic noise but yield only modest gains on real-world noisy datasets. To address these problems, this paper proposes a novel label-noise robust method named Discrepant Adversarial Training (DAT). DAT enforces the extraction of prominent features by matching the feature distributions of clean and noisy data. Given such a noise-free feature representation, the deep network can then simply output the correct result. To better capture the divergence between the noisy and clean distributions, a new metric is designed that makes this divergence computable. By minimizing the proposed metric through min-max training of the discrepancy on classifiers and generators, DAT matches noisy data to clean data in the feature space. To the best of our knowledge, DAT is the first method to address the noisy label problem from the perspective of the feature distribution. Experiments on synthetic and real-world noisy datasets demonstrate that DAT consistently outperforms other state-of-the-art methods. Code is available at https://github.com/Tyqnn0323/DAT.
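The abstract does not define the proposed metric, so the following is only a minimal sketch of the general idea of a discrepancy between two classifiers, not DAT's actual metric. It assumes the discrepancy is the mean absolute difference between the softmax outputs of two classifier heads, a common choice in discrepancy-based adversarial training. The names `softmax` and `classifier_discrepancy` are hypothetical and do not come from the paper or its repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classifier_discrepancy(logits_a, logits_b):
    """Mean absolute difference between two classifiers' softmax outputs.

    ASSUMPTION: this L1 form is an illustrative stand-in; the paper's
    own metric is not specified in the abstract.
    logits_a, logits_b: lists of per-sample logit lists with equal shapes.
    """
    if not logits_a or len(logits_a) != len(logits_b):
        raise ValueError("expected two non-empty batches of equal size")
    total, count = 0.0, 0
    for za, zb in zip(logits_a, logits_b):
        pa, pb = softmax(za), softmax(zb)
        total += sum(abs(x - y) for x, y in zip(pa, pb))
        count += len(pa)
    return total / count
```

In a min-max scheme of this kind, the classifiers would typically be updated to increase such a discrepancy on noisy samples, while the feature generator would be updated to decrease it, which pulls noisy features toward the region where the classifiers agree.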