Measuring Distance Between Unordered Sets of Different Sizes
Andrew Gardner, Jinko Kanno, Christian A. Duncan, Rastko Selmic; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, pp. 137-143
Abstract
We present a distance metric based upon the notion of minimum-cost injective mappings between sets. Our function satisfies metric properties as long as the cost of the minimum mappings is derived from a semimetric, for which the triangle inequality is not necessarily satisfied. We show that the Jaccard distance (alternatively biotope, Tanimoto, or Marczewski-Steinhaus distance) may be considered the special case for finite sets where costs are derived from the discrete metric. Extensions that allow premetrics (not necessarily symmetric), multisets (generalized to include probability distributions), and asymmetric mappings are given that expand the versatility of the metric without sacrificing metric properties. The function has potential applications in pattern recognition, machine learning, and information retrieval.
Related Material
[pdf]
[
bibtex]
@InProceedings{Gardner_2014_CVPR,
author = {Gardner, Andrew and Kanno, Jinko and Duncan, Christian A. and Selmic, Rastko},
title = {Measuring Distance Between Unordered Sets of Different Sizes},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2014}
}