Learning Better Visual Data Similarities via New Grouplet Non-Euclidean Embedding

Yanfu Zhang, Lei Luo, Wenhan Xian, Heng Huang; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 9918-9927

Abstract


In many computer vision problems, it is desirable to learn an effective visual data similarity so that prediction accuracy can be improved. Deep Metric Learning (DML) methods have been actively studied for measuring data similarity. Pair-based and proxy-based losses are the two major paradigms in DML. However, pair-based methods incur expensive training costs, while proxy-based methods are less accurate in characterizing the relationships between data points. In this paper, we propose a hybrid grouplet paradigm, which inherits the accurate pair-wise relationships of pair-based methods and the efficient training of proxy-based methods. Our method also equips DML with a non-Euclidean space, which employs a hierarchical representation manifold. More specifically, we propose a unified graph perspective: different DML methods learn different local connecting patterns between data points. Based on this graph interpretation, we construct a flexible subset of data points, dubbed a grouplet. Our grouplet does not require explicit pair-wise relationships; instead, we encode the data relationships in an optimal transport problem regarding the proxies, and solve this problem via a differentiable implicit layer to automatically determine the relationships. Extensive experimental results show that our method significantly outperforms state-of-the-art baselines on several benchmarks. The ablation studies also verify the effectiveness of our method.
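To make the optimal transport step more concrete, the following is a minimal sketch of how soft assignments between embeddings and proxies might be computed. It is not the differentiable implicit layer described in the abstract: it substitutes plain Sinkhorn iterations on an entropy-regularized problem with uniform marginals and a squared Euclidean cost. All of these choices are assumptions made for illustration, as are the names `sinkhorn_plan`, `embeddings`, and `proxies`.

```python
import numpy as np


def sinkhorn_plan(cost, eps=0.5, n_iters=200):
    """Approximate an entropy-regularized transport plan (illustrative only).

    cost: (n, m) array of costs between n data points and m proxies.
    Both marginals are assumed uniform, which is a simplification.
    """
    n, m = cost.shape
    a = np.full(n, 1.0 / n)   # mass carried by each data point
    b = np.full(m, 1.0 / m)   # mass received by each proxy
    K = np.exp(-cost / eps)   # Gibbs kernel of the regularized problem
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)       # rescale rows toward marginal a
        v = b / (K.T @ u)     # rescale columns toward marginal b
    return u[:, None] * K * v[None, :]


# Hypothetical toy data: 6 embeddings and 3 proxies in a 4-dimensional space.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(6, 4))
proxies = rng.normal(size=(3, 4))

# Squared Euclidean cost, normalized to [0, 1] for numerical stability.
cost = ((embeddings[:, None, :] - proxies[None, :, :]) ** 2).sum(axis=-1)
cost = cost / cost.max()

plan = sinkhorn_plan(cost)
```

Each row of `plan` can be read as a soft assignment of one data point to the proxies, so the relationships are determined by solving the transport problem rather than by enumerating explicit pairs.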

Related Material


@InProceedings{Zhang_2021_ICCV,
    author    = {Zhang, Yanfu and Luo, Lei and Xian, Wenhan and Huang, Heng},
    title     = {Learning Better Visual Data Similarities via New Grouplet Non-Euclidean Embedding},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {9918-9927}
}