Unbiased Metric Learning: On the Utilization of Multiple Datasets and Web Images for Softening Bias

Chen Fang, Ye Xu, Daniel N. Rockmore; The IEEE International Conference on Computer Vision (ICCV), 2013, pp. 1657-1664

Abstract


Many standard computer vision datasets exhibit biases that stem from a variety of sources, including illumination conditions, imaging systems, and the preferences of dataset collectors. Such biases can have downstream effects when vision datasets are used to build generalizable techniques, especially when the goal is a classification system capable of generalizing to unseen and novel datasets. In this work we propose Unbiased Metric Learning (UML), a metric learning approach, to achieve this goal. UML operates in two steps: (1) By varying hyperparameters, it learns a set of less biased candidate distance metrics on training examples from multiple biased datasets. The key idea is to learn a neighborhood for each example that consists not only of examples of the same category from the same dataset, but also of those from other datasets. The learning framework is based on structural SVM. (2) We perform model validation on a set of weakly-labeled web images retrieved by issuing class labels as keywords to a search engine. The metric with the best validation performance is selected. Although the web images sometimes have noisy labels, they tend to be less biased, which makes them suitable as a validation set for our task. Cross-dataset image classification experiments show significant performance improvement on four well-known computer vision datasets.
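The selection step (2) can be illustrated with a minimal sketch. It is not the authors' implementation: the structural SVM learning of step (1) is not reproduced, and the candidate metrics here are hand-picked stand-ins. The sketch also assumes that a metric is scored by nearest-neighbor classification accuracy under a Mahalanobis-style distance, which the abstract does not state. All function names and the toy data are hypothetical.

```python
import numpy as np

def mahalanobis_sq(M, x, Y):
    """Squared distance (x - y)^T M (x - y) from x to every row y of Y."""
    diff = Y - x
    return np.einsum("ij,jk,ik->i", diff, M, diff)

def knn_predict(M, X_train, y_train, x, k=1):
    """Majority label among the k training points nearest to x under M."""
    d = mahalanobis_sq(M, x, X_train)
    nearest = np.argsort(d, kind="stable")[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

def validation_accuracy(M, X_train, y_train, X_val, y_val, k=1):
    """Fraction of validation examples classified correctly under metric M."""
    preds = np.array([knn_predict(M, X_train, y_train, x, k) for x in X_val])
    return float(np.mean(preds == y_val))

def select_metric(candidates, X_train, y_train, X_val, y_val, k=1):
    """Return the index of the best-scoring candidate metric and all scores."""
    scores = [validation_accuracy(M, X_train, y_train, X_val, y_val, k)
              for M in candidates]
    return int(np.argmax(scores)), scores

# Toy data (assumption): feature 0 carries the class, feature 1 mimics a
# dataset-specific bias (0 for one source dataset, 10 for the other).
X_train = np.array([[0.0, 0.0], [0.0, 10.0], [1.0, 0.0], [1.0, 10.0]])
y_train = np.array([0, 0, 1, 1])

# Stand-in for the weakly-labeled web images used as the validation set.
X_val = np.array([[0.1, 5.0], [0.9, 5.0], [0.2, 9.0], [0.8, 1.0]])
y_val = np.array([0, 1, 0, 1])

# Hand-picked candidates standing in for metrics learned in step (1):
# one that attends only to the bias feature, one only to the class feature.
candidates = [np.diag([0.0, 1.0]), np.diag([1.0, 0.0])]

best, scores = select_metric(candidates, X_train, y_train, X_val, y_val)
print(best, scores)
```

In this toy setting the metric that ignores the bias feature scores higher on the validation set, which is the behavior the selection step relies on.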

Related Material


[bibtex]
@InProceedings{Fang_2013_ICCV,
author = {Fang, Chen and Xu, Ye and Rockmore, Daniel N.},
title = {Unbiased Metric Learning: On the Utilization of Multiple Datasets and Web Images for Softening Bias},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {December},
year = {2013},
pages = {1657--1664}
}