Fried Binary Embedding for High-Dimensional Visual Features

Weixiang Hong, Junsong Yuan, Sreyasee Das Bhattacharjee; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 2749-2757


Most existing binary embedding methods prefer compact binary codes (b-dimensional) to avoid the high computational and memory cost of projecting high-dimensional visual features (d-dimensional, b < d). We argue that long binary codes (b ~ O(d)) are critical to fully utilize the discriminative power of high-dimensional visual features, and can achieve better results in various tasks such as approximate nearest neighbour search. Generating long binary codes involves a large projection matrix and high-dimensional matrix-vector multiplication, and is thus memory- and compute-intensive. To tackle these problems, we propose Fried Binary Embedding (FBE), which decomposes the projection matrix using the adaptive Fastfood transform, a product of several structured matrices. As a result, FBE reduces the computational complexity from O(d^2) to O(d log d) and the memory cost from O(d^2) to O(d). More importantly, by using structured matrices, FBE regularizes the projection matrix against over-fitting and leads to even better accuracy than an unconstrained projection matrix (as in ITQ [4]) with the same long code length. Experimental comparisons with state-of-the-art methods over various visual applications demonstrate both the efficiency and performance advantages of FBE.
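To make the complexity claim concrete, the following is a minimal sketch of a Fastfood-style projection followed by sign binarization. It assumes the commonly used Fastfood factorization S H G P H B (diagonal matrices S, G, B, a permutation P, and the Walsh-Hadamard matrix H); the abstract does not spell out the exact factorization or the training procedure used by FBE, so the function names (`fwht`, `make_params`, `fastfood_project`, `binary_code`), the random initialization, and the choice b = d are illustrative assumptions rather than the authors' implementation. In an adaptive variant the diagonal entries would be learned from data instead of sampled.

```python
import random


def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform in O(d log d).

    len(x) must be a power of two.
    """
    y = list(x)
    n = len(y)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = y[j], y[j + h]
                y[j], y[j + h] = a + b, a - b
        h *= 2
    return y


def make_params(d, seed=0):
    """Hypothetical parameter set: three diagonals and one permutation.

    Storage is 4 * d numbers, i.e. O(d), versus d * d for a dense matrix.
    Values are sampled at random here; an adaptive variant would learn them.
    """
    rng = random.Random(seed)
    return {
        "B": [rng.choice((-1.0, 1.0)) for _ in range(d)],
        "G": [rng.gauss(0.0, 1.0) for _ in range(d)],
        "S": [rng.gauss(0.0, 1.0) for _ in range(d)],
        "perm": rng.sample(range(d), d),
    }


def fastfood_project(x, p):
    """Apply S H G P H B to x without ever forming a d x d matrix."""
    v = [b * xi for b, xi in zip(p["B"], x)]   # diagonal B
    v = fwht(v)                                # Hadamard H
    v = [v[i] for i in p["perm"]]              # permutation P
    v = [g * vi for g, vi in zip(p["G"], v)]   # diagonal G
    v = fwht(v)                                # Hadamard H
    return [s * vi for s, vi in zip(p["S"], v)]  # diagonal S


def binary_code(x, p):
    """Binarize the projection by sign, giving a d-bit code (b = d)."""
    return [1 if v >= 0 else 0 for v in fastfood_project(x, p)]
```

Each diagonal product and the permutation cost O(d), and the two Hadamard transforms cost O(d log d), so the whole projection is O(d log d) in time and O(d) in memory, in line with the figures quoted in the abstract.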

Related Material

author = {Hong, Weixiang and Yuan, Junsong and Das Bhattacharjee, Sreyasee},
title = {Fried Binary Embedding for High-Dimensional Visual Features},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {July},
year = {2017}