ALADIN: All Layer Adaptive Instance Normalization for Fine-Grained Style Similarity

Dan Ruta, Saeid Motiian, Baldo Faieta, Zhe Lin, Hailin Jin, Alex Filipkowski, Andrew Gilbert, John Collomosse; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 11926-11935

Abstract


We present ALADIN (All Layer AdaIN), a novel architecture for searching images based on the similarity of their artistic style. Representation learning is critical to visual search, where distance in the learned search embedding reflects image similarity. Learning an embedding that discriminates fine-grained variations in style is hard, due to the difficulty of defining and labelling style. ALADIN takes a weakly supervised approach to learning a representation for fine-grained style similarity of digital artworks, leveraging BAM-FG, a novel large-scale dataset of user-generated content groupings gathered from the web. ALADIN sets a new state-of-the-art accuracy for style-based visual search over both coarse-labelled style data (BAM) and BAM-FG, a new 2.62 million image dataset of 310,000 fine-grained style groupings also contributed by this work.
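The abstract's core idea, aggregating AdaIN-style statistics from all encoder layers into a single style embedding, can be illustrated with a minimal sketch. This is not the paper's actual architecture (which is trained with weak supervision on BAM-FG); it is only a toy NumPy illustration of how per-channel means and standard deviations from every layer, the statistics Adaptive Instance Normalization operates on, can be concatenated into one style descriptor. All layer shapes and function names here are hypothetical.

```python
import numpy as np

def adain_stats(feat):
    """Per-channel mean and std of a (C, H, W) feature map --
    the statistics that Adaptive Instance Normalization uses."""
    c = feat.shape[0]
    flat = feat.reshape(c, -1)
    return flat.mean(axis=1), flat.std(axis=1)

def style_embedding(layer_feats):
    """Concatenate AdaIN statistics from every encoder layer into
    one vector -- a sketch of the 'all layer' aggregation idea."""
    parts = []
    for feat in layer_feats:
        mu, sigma = adain_stats(feat)
        parts.extend([mu, sigma])
    return np.concatenate(parts)

# Toy encoder activations from three layers of decreasing resolution
# (hypothetical channel counts and spatial sizes).
rng = np.random.default_rng(0)
feats = [rng.standard_normal((c, s, s)) for c, s in [(64, 32), (128, 16), (256, 8)]]
emb = style_embedding(feats)
# Embedding length = 2 * (64 + 128 + 256) = 896
```

In a retrieval setting, distances between such embeddings (e.g. cosine distance) would rank images by style similarity, which is the role the learned search embedding plays in the paper.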

Related Material


@InProceedings{Ruta_2021_ICCV,
    author    = {Ruta, Dan and Motiian, Saeid and Faieta, Baldo and Lin, Zhe and Jin, Hailin and Filipkowski, Alex and Gilbert, Andrew and Collomosse, John},
    title     = {ALADIN: All Layer Adaptive Instance Normalization for Fine-Grained Style Similarity},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {11926-11935}
}