Euclidean and Hamming Embedding for Image Patch Description With Convolutional Networks

Zishun Liu, Zhenxi Li, Juyong Zhang, Ligang Liu; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2016, pp. 72-78

Abstract


Local feature descriptors represent image patches as floating-point or binary arrays for computer vision tasks. In this paper, we propose to train Euclidean and Hamming embeddings for image patch description with triplet convolutional networks. Thanks to the learning ability of deep ConvNets, the trained local feature generation method, called Deeply Learned Feature Transform (DELFT), achieves good distinctiveness and robustness. Evaluated on the UBC benchmark, we obtain state-of-the-art results with both floating-point and binary features. The learned features can also cooperate with existing nearest neighbor search algorithms in Euclidean and Hamming space. In addition, we construct a new benchmark to facilitate future related research; it contains 40 million image patches corresponding to 6.7 million 3D points, 25 times larger than the existing dataset. The experimental results demonstrate the distinctiveness and robustness of the proposed method.
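
The abstract describes the approach only at a high level. As a rough illustration of the two embedding spaces involved, the sketch below shows a standard triplet margin loss on Euclidean distances together with a sign-based binarization that yields Hamming codes. The function names, margin value, and binarization rule are illustrative assumptions for a minimal sketch, not the authors' exact DELFT formulation.

import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss on squared Euclidean distances.

    Illustrative assumption: triplet training pulls embeddings of
    matching patches together and pushes non-matching ones apart;
    the paper's exact margin and distance may differ.
    """
    d_pos = np.sum((anchor - positive) ** 2, axis=1)  # distance to matching patch
    d_neg = np.sum((anchor - negative) ** 2, axis=1)  # distance to non-matching patch
    return np.maximum(0.0, d_pos - d_neg + margin).mean()

def to_hamming(embedding):
    """Binarize a real-valued embedding by sign to obtain a Hamming code.

    Illustrative assumption: one common way to derive binary descriptors
    from a learned real-valued embedding.
    """
    return (embedding > 0).astype(np.uint8)

# Toy usage with random 128-D descriptors for a batch of 8 triplets.
rng = np.random.default_rng(0)
a, p, n = (rng.normal(size=(8, 128)) for _ in range(3))
print(triplet_loss(a, p, n))       # scalar loss for the batch
print(to_hamming(a)[0][:16])       # first 16 bits of one binary code

Descriptors trained this way can be indexed directly by off-the-shelf nearest neighbor search structures, using Euclidean distance on the real-valued embeddings or Hamming distance on the binarized codes.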

Related Material


[pdf]
[bibtex]
@InProceedings{Liu_2016_CVPR_Workshops,
author = {Liu, Zishun and Li, Zhenxi and Zhang, Juyong and Liu, Ligang},
title = {Euclidean and Hamming Embedding for Image Patch Description With Convolutional Networks},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2016}
}