Multi-Label Cross-Modal Retrieval

Viresh Ranjan, Nikhil Rasiwasia, C. V. Jawahar; Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2015, pp. 4094-4102


In this work, we address the problem of cross-modal retrieval in the presence of multi-label annotations. In particular, we introduce multi-label Canonical Correlation Analysis (ml-CCA), an extension of CCA that learns shared subspaces while taking into account high-level semantic information in the form of multi-label annotations. Unlike CCA, ml-CCA does not rely on explicit pairing between modalities; instead, it uses the multi-label information to establish correspondences. This results in a discriminative subspace that is better suited to cross-modal retrieval tasks. We also present Fast ml-CCA, a computationally efficient version of ml-CCA that can handle large-scale datasets. We demonstrate the efficacy of our approach through extensive cross-modal retrieval experiments on three standard benchmark datasets. The results show that the proposed approach achieves state-of-the-art retrieval performance on all three datasets.
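The idea of replacing explicit pairing with label-based correspondences can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the formulation, so the cosine similarity between label vectors, the similarity-weighted covariance matrices, the ridge term `reg`, and the function names `label_similarity` and `label_weighted_cca` are all assumptions made for illustration.

```python
import numpy as np


def label_similarity(Zx, Zy):
    """Cosine similarity between multi-label vectors (an assumed choice)."""
    Zx = Zx / np.maximum(np.linalg.norm(Zx, axis=1, keepdims=True), 1e-12)
    Zy = Zy / np.maximum(np.linalg.norm(Zy, axis=1, keepdims=True), 1e-12)
    return Zx @ Zy.T  # shape (n_x, n_y), non-negative for binary labels


def label_weighted_cca(X, Y, Zx, Zy, dim, reg=1e-3):
    """CCA-style projections where every cross-modal pair (i, j) contributes
    in proportion to its label similarity; no one-to-one pairing is needed.

    X: (n_x, d_x) features of modality 1, Zx: (n_x, c) label vectors
    Y: (n_y, d_y) features of modality 2, Zy: (n_y, c) label vectors
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    F = label_similarity(Zx, Zy)
    total = F.sum()  # assumed > 0, i.e. some labels are shared

    # Similarity-weighted cross- and within-modality covariances.
    Cxy = X.T @ F @ Y / total
    Cxx = X.T @ (F.sum(axis=1)[:, None] * X) / total + reg * np.eye(X.shape[1])
    Cyy = Y.T @ (F.sum(axis=0)[:, None] * Y) / total + reg * np.eye(Y.shape[1])

    # Whiten both modalities via Cholesky factors, then take the SVD of the
    # whitened cross-covariance Lx^{-1} Cxy Ly^{-T}.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    A = np.linalg.solve(Lx, Cxy)
    T = np.linalg.solve(Ly, A.T).T
    U, s, Vt = np.linalg.svd(T)

    Wx = np.linalg.solve(Lx.T, U[:, :dim])     # projection for modality 1
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :dim])  # projection for modality 2
    return Wx, Wy, s[:dim]
```

Under these assumptions, `X @ Wx` and `Y @ Wy` place both modalities in a common `dim`-dimensional space in which retrieval can be done by nearest-neighbour search. Note that `X` and `Y` may contain different numbers of samples, since correspondences come only from the label vectors.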

Related Material

author = {Ranjan, Viresh and Rasiwasia, Nikhil and Jawahar, C. V.},
title = {Multi-Label Cross-Modal Retrieval},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
month = {December},
year = {2015}