PICTURE: PhotorealistIC virtual Try-on from UnconstRained dEsigns

Shuliang Ning, Duomin Wang, Yipeng Qin, Zirong Jin, Baoyuan Wang, Xiaoguang Han; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 6976-6985

Abstract


In this paper, we propose a novel virtual try-on from unconstrained designs (ucVTON) task to enable photorealistic synthesis of personalized composite clothing on input human images. Unlike prior art constrained by specific input types, our method allows flexible specification of both style (text or image) and texture (full garments, cropped sections, or texture patches) conditions. To address the entanglement challenge that arises when full garment images are used as conditions, we develop a two-stage pipeline with explicit disentanglement of style and texture. In the first stage, we generate a human parsing map that reflects the desired style, conditioned on the style input. In the second stage, we composite textures onto the parsing-map regions based on the texture input. To represent complex and non-stationary textures, which previous fashion editing works have not achieved, we are the first to propose extracting hierarchical and balanced CLIP features and applying position encoding in VTON. Experiments demonstrate the superior synthesis quality and personalization enabled by our method. The flexible control over style and texture mixing brings virtual try-on to a new level of user experience for online shopping and fashion design.
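The two-stage design described above can be summarized with a minimal sketch. This is purely illustrative and not the authors' implementation: the function names (stage1_parsing, stage2_texture), the Conditions container, and the placeholder logic are assumptions; the actual stages are learned generative models, and the hierarchical CLIP feature extraction with position encoding is omitted here.

# Illustrative sketch of the two-stage ucVTON pipeline described in the abstract.
# All names are hypothetical; the real stages are learned generative networks.

from dataclasses import dataclass
from typing import Optional

import numpy as np


@dataclass
class Conditions:
    style_text: Optional[str] = None            # e.g. "off-shoulder cropped top"
    style_image: Optional[np.ndarray] = None    # reference garment image for style
    texture_image: Optional[np.ndarray] = None  # full garment, cropped section, or patch


def stage1_parsing(person_image: np.ndarray, cond: Conditions) -> np.ndarray:
    """Stage 1: predict a human parsing map reflecting the desired garment style
    (shape/category), conditioned on the text or image style input.
    This sketch returns a dummy single-channel label map of matching size."""
    h, w = person_image.shape[:2]
    return np.zeros((h, w), dtype=np.int64)  # placeholder parsing labels


def stage2_texture(person_image: np.ndarray, parsing_map: np.ndarray,
                   cond: Conditions) -> np.ndarray:
    """Stage 2: synthesize garment appearance inside the parsing-map regions,
    conditioned on features of the texture input (hierarchical CLIP features
    plus position encoding in the paper; omitted in this sketch)."""
    output = person_image.copy()
    garment_mask = parsing_map > 0
    # A real implementation would run a conditional generator here; we only
    # mark the garment region to keep the sketch self-contained and runnable.
    output[garment_mask] = 0.0
    return output


if __name__ == "__main__":
    person = np.random.rand(256, 192, 3).astype(np.float32)
    cond = Conditions(style_text="long-sleeve shirt",
                      texture_image=np.random.rand(64, 64, 3))
    parsing = stage1_parsing(person, cond)
    result = stage2_texture(person, parsing, cond)
    print(result.shape)  # (256, 192, 3)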

Related Material


@InProceedings{Ning_2024_CVPR,
    author    = {Ning, Shuliang and Wang, Duomin and Qin, Yipeng and Jin, Zirong and Wang, Baoyuan and Han, Xiaoguang},
    title     = {PICTURE: PhotorealistIC virtual Try-on from UnconstRained dEsigns},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {6976-6985}
}