Learning Feature Representations for Look-Alike Images

Ayca Takmaz, Thomas Probst, Danda Pani Paudel, Luc Van Gool; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2019, pp. 21-24


Human perception of visual similarity relies on information ranging from low-level features such as texture and color, to high-level features such as objects and scene elements. While generic features learned for image or face recognition tasks somewhat correlate with perceived visual similarity, they are found to be inadequate for matching look-alike images. In this paper, we learn a 'look-alike feature' embedding, capable of representing the perceived image similarity, by fusing low- and high-level features within a modified CNN encoder architecture. This encoder is trained using the triplet loss paradigm on look-alike image pairs. Our findings demonstrate that combining features from different layers across the network is beneficial for look-alike image matching, and clearly outperforms standard pretrained networks followed by finetuning. Furthermore, we show that the learned similarities are meaningful, and capture color, shape, facial or holistic appearance patterns, depending on the context and image modality.
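The two ingredients named above, multi-layer feature fusion and the triplet loss, can be sketched in a few lines. The snippet below is a minimal NumPy illustration, not the paper's implementation: `fuse_features` is a hypothetical stand-in that pools and concatenates per-layer feature maps into one embedding, and `triplet_loss` is the standard triplet margin loss that the training paradigm refers to.

```python
import numpy as np

def fuse_features(feature_maps):
    """Fuse low- and high-level features into one embedding.

    Hypothetical sketch: global-average-pool each layer's (C, H, W)
    feature map, concatenate the pooled vectors, and L2-normalise.
    """
    pooled = [fm.mean(axis=(1, 2)) for fm in feature_maps]  # (C,) per layer
    v = np.concatenate(pooled)
    return v / (np.linalg.norm(v) + 1e-8)

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet margin loss on embedding vectors:
    pull the positive closer to the anchor than the negative,
    by at least `margin` (in squared Euclidean distance)."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

# Toy usage: two layers of random "activations" for one image.
rng = np.random.default_rng(0)
maps = [rng.standard_normal((3, 4, 4)),   # low-level layer (3 channels)
        rng.standard_normal((8, 2, 2))]   # high-level layer (8 channels)
embedding = fuse_features(maps)           # shape (11,), unit norm
```

During training, the encoder would be updated so that look-alike pairs produce nearby embeddings and non-matching images produce distant ones; here the fused vector simply has one entry per channel across all fused layers.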

Related Material

@InProceedings{Takmaz_2019_CVPR_Workshops,
  author    = {Takmaz, Ayca and Probst, Thomas and Pani Paudel, Danda and Van Gool, Luc},
  title     = {Learning Feature Representations for Look-Alike Images},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
  month     = {June},
  year      = {2019}
}