Deep Cross-Modal Hashing

Qing-Yuan Jiang, Wu-Jun Li; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 3232-3240


Due to its low storage cost and fast query speed, cross-modal hashing (CMH) has been widely used for similarity search in multimedia retrieval applications. However, most existing CMH methods are based on hand-crafted features which might not be optimally compatible with the hash-code learning procedure. As a result, existing CMH methods with hand-crafted features may not achieve satisfactory performance. In this paper, we propose a novel CMH method, called deep cross-modal hashing (DCMH), by integrating feature learning and hash-code learning into the same framework. DCMH is an end-to-end learning framework with deep neural networks, one for each modality, to perform feature learning from scratch. Experiments on three real datasets with image-text modalities show that DCMH can outperform other baselines to achieve state-of-the-art performance in cross-modal retrieval applications.
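
The low storage cost and fast query speed mentioned in the abstract come from representing every item as a short binary code and comparing codes by Hamming distance. The following is a minimal sketch of that retrieval step only, not of DCMH itself: the 8-bit codes are hypothetical values written by hand, whereas in DCMH they would be produced by the learned per-modality networks, and the function names are illustrative assumptions.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two hash codes packed as ints."""
    return bin(a ^ b).count("1")


def pack_bits(bits):
    """Pack a sequence of 0/1 values into one integer (first bit is most significant)."""
    code = 0
    for bit in bits:
        code = (code << 1) | (1 if bit else 0)
    return code


def rank_by_hamming(query_code, database_codes):
    """Return database indices sorted by Hamming distance to the query.

    Ties keep their original index order because sorted() is stable.
    """
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )


# Hypothetical 8-bit codes: one text query and three image codes.
text_query = pack_bits([1, 0, 1, 1, 0, 0, 1, 0])
image_db = [
    pack_bits([0, 1, 0, 0, 1, 1, 0, 1]),  # every bit differs from the query
    pack_bits([1, 0, 1, 1, 0, 0, 1, 1]),  # one bit differs
    pack_bits([1, 0, 1, 0, 0, 1, 1, 0]),  # two bits differ
]

ranking = rank_by_hamming(text_query, image_db)
print(ranking)  # [1, 2, 0]
```

Because both modalities are mapped into the same code space, a text query can be ranked against image codes with nothing more than an XOR and a bit count per item, which is what makes hashing-based search cheap in both memory and time.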

Related Material

author = {Jiang, Qing-Yuan and Li, Wu-Jun},
title = {Deep Cross-Modal Hashing},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {July},
year = {2017}