Lost Your Style? Navigating With Semantic-Level Approach for Text-To-Outfit Retrieval

Junkyu Jang, Eugene Hwang, Sung-Hyuk Park; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, pp. 8066-8075

Abstract


Fashion stylists have historically bridged the gap between consumers' desires and perfect outfits, which involve intricate combinations of colors, patterns, and materials. Although recent advancements in fashion recommendation systems have made strides in outfit compatibility prediction and complementary item retrieval, these systems rely heavily on pre-selected customer choices. Therefore, we introduce a groundbreaking approach to fashion recommendations: text-to-outfit retrieval task that generates a complete outfit set based solely on textual descriptions given by users. Our model is devised at three semantic levels--item, style, and outfit--where each level progressively aggregates data to form a coherent outfit recommendation based on textual input. Here, we leverage strategies similar to those in the contrastive language-image pretraining model to address the intricate-style matrix within the outfit sets. Using the Maryland Polyvore and Polyvore Outfit datasets, our approach significantly outperformed state-of-the-art models in text-video retrieval tasks, solidifying its effectiveness in the fashion recommendation domain. This research not only pioneers a new facet of fashion recommendation systems, but also introduces a method that captures the essence of individual style preferences through textual descriptions.

Related Material


[pdf] [supp] [arXiv]
[bibtex]
@InProceedings{Jang_2024_WACV, author = {Jang, Junkyu and Hwang, Eugene and Park, Sung-Hyuk}, title = {Lost Your Style? Navigating With Semantic-Level Approach for Text-To-Outfit Retrieval}, booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)}, month = {January}, year = {2024}, pages = {8066-8075} }