Attention-aware Deep Adversarial Hashing for Cross-Modal Retrieval
Xi Zhang, Hanjiang Lai, Jiashi Feng; Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 591-606
Abstract
Due to the rapid growth of multi-modal data, hashing methods for cross-modal retrieval have received considerable attention. However, finding content similarities between different modalities of data remains challenging due to the heterogeneity gap between them. To address this problem, we propose an adversarial hashing network with an attention mechanism that enhances the measurement of content similarities by selectively focusing on the informative parts of multi-modal data. The proposed deep adversarial network consists of three building blocks: 1) a feature learning module that obtains the feature representations; 2) an attention module that generates an attention mask, which divides the feature representations into attended and unattended feature representations; and 3) a hashing module that learns hash functions preserving the similarities between different modalities. In our framework, the attention and hashing modules are trained in an adversarial way: the attention module attempts to make the hashing module unable to preserve the similarities of multi-modal data w.r.t. the unattended feature representations, while the hashing module aims to preserve the similarities of multi-modal data w.r.t. both the attended and unattended feature representations. Extensive evaluations on several benchmark datasets demonstrate that the proposed method brings substantial improvements over other state-of-the-art cross-modal hashing methods.
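The interplay the abstract describes can be sketched in a few lines: an attention mask splits a feature vector into attended and unattended parts, and each part is hashed by sign-thresholding a linear projection. This is a minimal illustrative sketch, not the authors' implementation; all shapes, variable names, and the random initialization are hypothetical.

```python
import numpy as np

# Hypothetical sketch of the three modules described in the abstract.
rng = np.random.default_rng(0)

d, k = 8, 4                            # feature dimension, hash code length
features = rng.normal(size=d)          # output of the feature learning module
mask = rng.uniform(size=d)             # attention module output, values in [0, 1]

attended = mask * features             # informative parts of the representation
unattended = (1.0 - mask) * features   # remaining, unattended parts

W = rng.normal(size=(k, d))            # hashing module: a linear projection

def hash_code(x):
    """Binarize a projected feature vector into a {-1, +1} code."""
    return np.sign(W @ x)

code_attended = hash_code(attended)
code_unattended = hash_code(unattended)
```

In the adversarial game, the attention module would update the mask so that codes computed from the unattended part fail to preserve cross-modal similarity, while the hashing module is trained to preserve similarity for both parts; the sketch above only shows the forward pass.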
Related Material
[pdf]
[arXiv]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Xi and Lai, Hanjiang and Feng, Jiashi},
title = {Attention-aware Deep Adversarial Hashing for Cross-Modal Retrieval},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}