AREA: Adaptive Reweighting via Effective Area for Long-Tailed Classification
Large-scale real-world data usually follow a long-tailed distribution (i.e., a few majority classes hold most of the training data, while most minority classes have few samples), which skews the hyperplanes heavily toward the minority classes. Traditionally, reweighting is adopted so that the hyperplanes split the feature space fairly, with the weights designed according to the number of samples in each class. However, we find that the number of samples in a class cannot accurately measure the size of its spanned space, especially for a majority class, whose spanned space is usually larger than its sample count indicates because of its high diversity. Weights based on sample counts therefore still compress the space of the minority classes. In this paper, we reconsider reweighting from a new perspective: analyzing the space spanned by each class. We argue that, besides sample counts, the relations between samples are also important for adequately describing the spanned space. Consequently, we estimate the size of the spanned space of each category, which we call its effective area, by analyzing the distribution of its samples in detail. By treating the samples of a class as identically distributed random variables and analyzing their correlations, we derive a simple, non-parametric formula for the effective area. Each class is then assigned a weight inversely proportional to its effective area to achieve fairer training. Our weights are also more flexible than count-based ones, because they are adjusted adaptively as the features are optimized during training. Experiments on four long-tailed datasets show that the proposed weights outperform state-of-the-art reweighting methods. Moreover, our method also achieves better results on the statistically balanced CIFAR-10/100. Code is available at https://github.com/xiaohua-chen/AREA.
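
The abstract does not state the effective-area formula, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the classical effective-sample-size form for correlated, identically distributed variables, N^2 divided by the sum of absolute pairwise correlations, and it assumes every class has at least one sample. The function names `effective_area` and `class_weights` are hypothetical.

```python
import numpy as np


def effective_area(features):
    """Hypothetical estimate: N^2 / sum of absolute pairwise sample correlations.

    `features` has shape (N, D), one feature vector per sample of a single class.
    This formula is an assumption, not taken from the abstract.
    """
    x = np.asarray(features, dtype=float)
    n = x.shape[0]
    # Center each sample's feature vector and scale it to unit length, so that
    # dot products between samples become Pearson correlation coefficients.
    x = x - x.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    x = x / np.maximum(norms, 1e-12)
    corr = np.abs(x @ x.T)  # shape (N, N), diagonal entries equal 1
    # Fully correlated samples give 1; mutually uncorrelated samples give N.
    return n * n / corr.sum()


def class_weights(features, labels, num_classes):
    """Weights inversely proportional to each class's effective area.

    Normalized so that the weights sum to `num_classes`.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    areas = np.array(
        [effective_area(features[labels == c]) for c in range(num_classes)]
    )
    weights = 1.0 / areas
    return weights * num_classes / weights.sum()
```

Under this assumed formula, four identical samples have an effective area of 1 and two uncorrelated samples have an effective area of 2, so the class with more samples can still receive the larger weight. To follow the adaptive behavior described above, the weights would be recomputed from the current features as training proceeds and passed to a weighted loss.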