Sampling Wisely: Deep Image Embedding by Top-K Precision Optimization

Jing Lu, Chaofan Xu, Wei Zhang, Ling-Yu Duan, Tao Mei; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 7961-7970


Deep image embedding aims at learning a convolutional neural network (CNN) based mapping function that maps an image to a feature vector. The embedding quality is usually evaluated by the performance in image search tasks. Since very few users bother to open the second page of search results, top-k precision mostly dominates the user experience and thus is one of the crucial evaluation metrics for the embedding quality. Despite being extensively studied, existing algorithms are usually based on heuristic observation without theoretical guarantee. Consequently, the gradient descent direction on the training loss is mostly inconsistent with the direction of optimizing the concerned evaluation metric. This inconsistency certainly misleads the training direction and degrades the performance. In contrast to existing works, in this paper, we propose a novel deep image embedding algorithm with end-to-end optimization of top-k precision, the evaluation metric that is closely related to user experience. Specifically, our loss function is constructed with wisely selected "misplaced" images along the top-k nearest neighbor decision boundary, so that the gradient descent update directly promotes the concerned metric, top-k precision. Furthermore, our theoretical analysis of the upper bounding and consistency properties of the proposed loss supports the claim that minimizing our proposed loss is equivalent to maximizing top-k precision. Experiments show that our proposed algorithm outperforms all compared state-of-the-art deep image embedding algorithms on three benchmark datasets.
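To make the evaluation metric concrete, the following is a minimal sketch of top-k precision for a single query, assuming Euclidean distance between embeddings and class labels as the relevance criterion. The function names `top_k_precision` and `misplaced_items`, the toy data, and the reading of "misplaced" as negatives ranked inside the top k plus positives ranked outside it are illustrative assumptions, not the definitions or the loss used in the paper.

```python
import math

def rank_gallery(query, gallery):
    """Return gallery indices sorted by Euclidean distance to the query (closest first)."""
    return sorted(range(len(gallery)), key=lambda i: math.dist(query, gallery[i]))

def top_k_precision(query, query_label, gallery, gallery_labels, k):
    """Fraction of the k nearest gallery items that share the query's label."""
    if not 1 <= k <= len(gallery):
        raise ValueError("k must be between 1 and the gallery size")
    ranked = rank_gallery(query, gallery)
    hits = sum(1 for i in ranked[:k] if gallery_labels[i] == query_label)
    return hits / k

def misplaced_items(query, query_label, gallery, gallery_labels, k):
    """Assumed reading of "misplaced": negatives inside the top k, positives outside it."""
    ranked = rank_gallery(query, gallery)
    negatives_inside = [i for i in ranked[:k] if gallery_labels[i] != query_label]
    positives_outside = [i for i in ranked[k:] if gallery_labels[i] == query_label]
    return negatives_inside, positives_outside

# Hypothetical 2-D embeddings and labels, for illustration only.
gallery = [(0.0, 0.1), (0.2, 0.0), (1.0, 1.0), (0.1, 0.3), (2.0, 2.0)]
gallery_labels = ["cat", "dog", "cat", "cat", "dog"]
query, query_label = (0.0, 0.0), "cat"

precision_at_3 = top_k_precision(query, query_label, gallery, gallery_labels, k=3)
misplaced_at_3 = misplaced_items(query, query_label, gallery, gallery_labels, k=3)
print(precision_at_3)   # 2 of the 3 nearest items are "cat"
print(misplaced_at_3)   # ([1], [2]): one "dog" inside the top 3, one "cat" outside
```

In this toy example, swapping the "dog" at index 1 with the "cat" at index 2 in the ranking would raise top-3 precision from 2/3 to 1, which is the kind of boundary correction the abstract's loss is described as encouraging.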

Related Material

author = {Lu, Jing and Xu, Chaofan and Zhang, Wei and Duan, Ling-Yu and Mei, Tao},
title = {Sampling Wisely: Deep Image Embedding by Top-K Precision Optimization},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019}