Multimodal Gaussian Process Latent Variable Models With Harmonization

Guoli Song, Shuhui Wang, Qingming Huang, Qi Tian; The IEEE International Conference on Computer Vision (ICCV), 2017, pp. 5029-5037


In this work, we address the multimodal learning problem with Gaussian process latent variable models (GPLVMs) and their application to cross-modal retrieval. Existing GPLVM-based studies generally impose individual priors over the model parameters and ignore the intrinsic relations among these parameters. Considering the strong complementarity between modalities, we propose a novel joint prior over the parameters of multimodal GPLVMs to propagate multimodal information in both the kernel hyperparameter spaces and the latent space. The joint prior is formulated as a harmonization constraint on the model parameters, which enforces agreement among the modality-specific GP kernels and similarity in the latent space. We incorporate the harmonization mechanism into the learning process of multimodal GPLVMs. The proposed methods are evaluated on three widely used multimodal datasets for cross-modal retrieval. Experimental results show that the harmonization mechanism helps the GPLVM algorithms learn non-linear correlations among heterogeneous modalities.
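To make the idea of kernel agreement concrete, here is a minimal sketch of one way such a penalty could be computed. The abstract does not state the form of the harmonization constraint, so everything specific here is an assumption for illustration: the RBF kernel, the squared Frobenius norm as the disagreement measure, and the hypothetical names `rbf_kernel` and `frobenius_disagreement`. It covers only agreement between two modality-specific kernels built on shared latent points, not the latent-space similarity term.

```python
import math


def rbf_kernel(points, variance, lengthscale):
    """RBF (squared-exponential) kernel matrix over a list of latent points."""
    n = len(points)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            K[i][j] = variance * math.exp(-0.5 * sq_dist / lengthscale ** 2)
    return K


def frobenius_disagreement(K_a, K_b):
    """Squared Frobenius norm of K_a - K_b; zero when the two kernels agree."""
    return sum(
        (x - y) ** 2
        for row_a, row_b in zip(K_a, K_b)
        for x, y in zip(row_a, row_b)
    )


# Shared latent points (assumed 2-D here) used by both modalities.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]

# Each modality has its own kernel hyperparameters (illustrative values).
K_image = rbf_kernel(X, variance=1.0, lengthscale=1.0)
K_text = rbf_kernel(X, variance=1.0, lengthscale=2.0)

# A penalty of this kind could be added to a training objective so that
# the hyperparameters of the two modalities are pulled toward agreement.
penalty = frobenius_disagreement(K_image, K_text)
print(penalty)
```

In this sketch the penalty vanishes only when both modalities induce the same kernel matrix on the shared latent points, which is one simple reading of "agreement among the modality-specific GP kernels"; other disagreement measures would fit the same slot.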

Related Material

author = {Song, Guoli and Wang, Shuhui and Huang, Qingming and Tian, Qi},
title = {Multimodal Gaussian Process Latent Variable Models With Harmonization},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {Oct},
year = {2017}