Disentangled Clothed Avatar Generation with Layered Representation
Abstract
Clothed avatar generation has wide applications in virtual and augmented reality, filmmaking, and more. While existing methods have made progress in creating animatable digital avatars, generating avatars with disentangled components (e.g., body, hair, and clothes) has long been a challenge. In this paper, we propose LayerAvatar, a novel feed-forward diffusion-based method capable of generating high-quality, component-disentangled clothed avatars in seconds. We propose a layered UV feature plane representation, in which components are distributed across different layers of a Gaussian-based UV feature plane with corresponding semantic labels. This representation can be effectively learned with current feed-forward generation pipelines, facilitating component disentanglement and enhancing the detail of generated avatars. Based on this representation, we train a single-stage diffusion model and introduce constraint terms to mitigate the severe occlusion of the innermost human body layer. Extensive experiments demonstrate the superior performance of our method in generating highly detailed and disentangled clothed avatars. In addition, we explore its applications in component transfer. The project page is available at https://olivia23333.github.io/LayerAvatar.
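The layered UV feature plane described in the abstract can be pictured as a stack of per-component feature planes in UV space, each decoded into 3D Gaussian attributes carrying a per-Gaussian semantic label. The PyTorch sketch below only illustrates that idea; the layer list, feature dimensions, decoder, and Gaussian attribute layout are assumptions for illustration, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

# Assumed component split; the paper disentangles body, hair, and clothes,
# but this exact layer list is illustrative.
LAYERS = ["body", "hair", "upper_clothes", "lower_clothes", "shoes"]


class LayeredUVFeaturePlane(nn.Module):
    """Sketch of a layered, Gaussian-based UV feature plane (hypothetical)."""

    def __init__(self, feat_dim: int = 32, uv_res: int = 128):
        super().__init__()
        # One UV-aligned feature plane per component layer.
        self.planes = nn.Parameter(
            torch.randn(len(LAYERS), feat_dim, uv_res, uv_res) * 0.01
        )
        # Shared decoder mapping per-texel features to 3D Gaussian attributes:
        # position offset (3) + rotation (4) + scale (3) + opacity (1) + color (3) = 14.
        self.decoder = nn.Sequential(
            nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 14)
        )

    def forward(self):
        avatar = {}
        for idx, name in enumerate(LAYERS):
            plane = self.planes[idx]                          # (C, H, W)
            c, h, w = plane.shape
            feats = plane.permute(1, 2, 0).reshape(h * w, c)  # one feature per UV texel
            attrs = self.decoder(feats)                       # (H*W, 14)
            avatar[name] = {
                "xyz_offset": attrs[:, 0:3],   # offset from a template surface
                "rotation":   attrs[:, 3:7],
                "scale":      attrs[:, 7:10],
                "opacity":    attrs[:, 10:11],
                "color":      attrs[:, 11:14],
                "label":      torch.full((h * w,), idx),      # per-Gaussian semantic label
            }
        return avatar


if __name__ == "__main__":
    avatar = LayeredUVFeaturePlane()()
    # Each component layer is decoded independently, which is what allows
    # editing or transferring a single component (e.g., the "hair" layer).
    print(avatar["hair"]["xyz_offset"].shape)  # torch.Size([16384, 3])
```

Because every component lives in its own layer of the feature plane, a generated component can in principle be swapped between avatars without touching the underlying body layer, which is the component-transfer application mentioned above.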
Related Material

[pdf] [supp] [arXiv]

@InProceedings{Zhang_2025_ICCV,
  author    = {Zhang, Weitian and Yan, Yichao and Wu, Sijing and Liao, Manwen and Yang, Xiaokang},
  title     = {Disentangled Clothed Avatar Generation with Layered Representation},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  month     = {October},
  year      = {2025},
  pages     = {11327-11338}
}