@InProceedings{Kocasari_2022_CVPR,
    author    = {Kocasari, Umut and Zaman, Kerem and Tiftikci, Mert and Simsar, Enis and Yanardag, Pinar},
    title     = {Rank in Style: A Ranking-Based Approach To Find Interpretable Directions},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2022},
    pages     = {2294-2298}
}
Rank in Style: A Ranking-Based Approach To Find Interpretable Directions
Abstract
Recent work such as StyleCLIP aims to harness the power of CLIP embeddings for controlled image manipulations. Although these models can manipulate images based on a text prompt, the success of an edit often depends on carefully choosing the right text for the desired manipulation. This limitation makes text-based manipulation particularly difficult in domains where the user lacks expertise, such as fashion. To address this problem, we propose a method for automatically determining the most successful and relevant text-based edits using a pre-trained StyleGAN model. Our approach consists of a novel mechanism that uses CLIP to guide beam-search decoding, and a ranking method that identifies the most relevant and successful edits based on a list of keywords. We demonstrate the capabilities of our framework in several domains, including fashion.
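To make the idea of score-guided beam search followed by ranking concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names, the keyword list, and especially `toy_score` are hypothetical. In the paper's setting the score would come from CLIP; here a fixed table of made-up weights stands in for it so the example runs without any model.

```python
from typing import Callable, List, Sequence, Tuple

Phrase = Tuple[str, ...]


def beam_search(
    vocab: Sequence[str],
    score: Callable[[Phrase], float],
    beam_width: int = 2,
    max_len: int = 2,
) -> List[Tuple[Phrase, float]]:
    """Return up to `beam_width` phrases of `max_len` keywords, best first."""
    beams: List[Phrase] = [()]
    for _ in range(max_len):
        # Extend every surviving phrase by one keyword it does not contain yet.
        candidates = [b + (w,) for b in beams for w in vocab if w not in b]
        # Rank the extensions by score and keep only the best `beam_width`.
        candidates.sort(key=score, reverse=True)
        beams = candidates[:beam_width]
    return [(b, score(b)) for b in beams]


# Hypothetical stand-in for a CLIP-based relevance score (made-up weights).
TOY_WEIGHTS = {"floral": 0.9, "dress": 0.8, "striped": 0.6, "shirt": 0.5}


def toy_score(phrase: Phrase) -> float:
    """Sum of per-keyword weights; an assumption, not a real CLIP similarity."""
    return sum(TOY_WEIGHTS[w] for w in phrase)


ranked = beam_search(list(TOY_WEIGHTS), toy_score, beam_width=2, max_len=2)
for phrase, value in ranked:
    print(" ".join(phrase), round(value, 2))
```

Passing the scorer as a callable keeps the search independent of the model: under these assumptions, swapping `toy_score` for a function that measures CLIP similarity between an edited image and a candidate phrase would leave the search loop unchanged.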