Exploit Clues From Views: Self-Supervised and Regularized Learning for Multiview Object Recognition

Chih-Hui Ho, Bo Liu, Tz-Ying Wu, Nuno Vasconcelos; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 9090-9100

Abstract


Multiview recognition has been well studied in the literature and achieves decent performance in object recognition and retrieval tasks. However, most previous works rely on supervised learning and on several impractical underlying assumptions, such as the availability of all views at both training and inference time. In this work, the problem of multiview self-supervised learning (MV-SSL) is investigated, where only the image-to-object association is given. Given this setup, a novel surrogate task for self-supervised learning is proposed by pursuing an "object invariant" representation. This is achieved by randomly selecting an image feature of an object as the object prototype, accompanied by multiview consistency regularization, which results in the view invariant stochastic prototype embedding (VISPE). Experiments show that the recognition and retrieval results using VISPE outperform those of other self-supervised learning methods on both seen and unseen data. VISPE can also be applied to semi-supervised scenarios and demonstrates robust performance when only limited labeled data are available. Code is available at https://github.com/chihhuiho/VISPE
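To make the surrogate task concrete, below is a minimal PyTorch-style sketch of a stochastic-prototype objective with a multiview consistency term. The function and argument names, the temperature, the batch construction, and the KL-based consistency regularizer are illustrative assumptions for exposition only, not the authors' implementation; see the linked repository for the actual code.

# Sketch of a stochastic-prototype loss with multiview consistency
# regularization, assuming a PyTorch setup. Names and the specific
# KL-based consistency term are illustrative assumptions.
import torch
import torch.nn.functional as F


def stochastic_prototype_loss(features, object_ids, temperature=0.1, reg_weight=1.0):
    """features: (N, D) embeddings of N images (several views per object).
    object_ids: (N,) integer labels giving only the image-to-object association."""
    features = F.normalize(features, dim=1)
    objects = object_ids.unique()

    # Randomly pick one view per object to act as that object's prototype.
    prototypes, targets = [], torch.empty_like(object_ids)
    for k, obj in enumerate(objects):
        idx = (object_ids == obj).nonzero(as_tuple=True)[0]
        pick = idx[torch.randint(len(idx), (1,))]
        prototypes.append(features[pick].squeeze(0))
        targets[idx] = k
    prototypes = torch.stack(prototypes)                    # (K, D)

    # Classify every view against the sampled prototypes.
    logits = features @ prototypes.t() / temperature        # (N, K)
    cls_loss = F.cross_entropy(logits, targets)

    # Multiview consistency: posteriors of views of the same object should agree.
    log_p = F.log_softmax(logits, dim=1)
    p = log_p.exp()
    reg, groups = 0.0, 0
    for obj in objects:
        idx = (object_ids == obj).nonzero(as_tuple=True)[0]
        if len(idx) < 2:
            continue
        mean_p = p[idx].mean(dim=0, keepdim=True).detach()
        reg = reg + F.kl_div(log_p[idx], mean_p.expand(len(idx), -1),
                             reduction="batchmean")
        groups += 1
    reg = reg / max(groups, 1)

    return cls_loss + reg_weight * reg

In use, features would typically come from a CNN encoder applied to a minibatch containing multiple views of multiple objects, with the prototypes resampled at every iteration, which is what makes the embedding stochastic.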

Related Material


BibTeX:
@InProceedings{Ho_2020_CVPR,
author = {Ho, Chih-Hui and Liu, Bo and Wu, Tz-Ying and Vasconcelos, Nuno},
title = {Exploit Clues From Views: Self-Supervised and Regularized Learning for Multiview Object Recognition},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}