CusConcept: Customized Visual Concept Decomposition with Diffusion Models
Abstract
Enabling generative models to decompose visual concepts from a single image is a complex and challenging problem. In this paper, we study a new task, customized concept decomposition, in which the objective is to leverage diffusion models to decompose a single image and generate visual concepts from various perspectives. To address this challenge, we propose a two-stage framework, CusConcept (short for Customized Visual Concept Decomposition), that extracts customized visual concept embedding vectors which can be embedded into prompts for text-to-image generation. In the first stage, CusConcept employs a vocabulary-guided concept decomposition mechanism to build vocabularies along human-specified conceptual axes. The decomposed concepts are obtained by retrieving the corresponding vocabularies and learning anchor weights. In the second stage, joint concept refinement is performed to enhance the fidelity and quality of the generated images. We further curate an evaluation benchmark for assessing performance on the open-world concept decomposition task. Our approach effectively generates high-quality images of the decomposed concepts and produces related lexical predictions as secondary outcomes. Extensive qualitative and quantitative experiments demonstrate the effectiveness of CusConcept. Our code and data are available at https://github.com/xzLcan/CusConcept.
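To make the first stage concrete, below is a minimal sketch of vocabulary-guided decomposition as the abstract describes it: anchor weights are learned over the text embeddings of a per-axis vocabulary, and their weighted combination serves as a pseudo-token embedding for text-to-image prompts. This is an illustration under stated assumptions, not the paper's actual implementation; the tensor shapes, optimizer settings, and the denoising_loss helper are hypothetical.

```python
# Minimal sketch of vocabulary-guided concept decomposition (stage 1).
# Assumptions (not from the paper): vocab_embeds holds text-encoder token
# embeddings for candidate words along one human-specified axis, and
# denoising_loss is a hypothetical stand-in for the diffusion reconstruction
# loss computed on the single input image.
import torch
import torch.nn.functional as F

def learn_axis_concept(vocab_embeds, denoising_loss, steps=500, lr=1e-2):
    """Learn anchor weights over a per-axis vocabulary; return the weighted
    embedding used as a pseudo-token in text-to-image prompts.

    vocab_embeds: (V, D) tensor of token embeddings for V vocabulary words.
    denoising_loss: callable mapping a (D,) concept embedding to a scalar loss.
    """
    logits = torch.zeros(vocab_embeds.size(0), requires_grad=True)
    opt = torch.optim.Adam([logits], lr=lr)
    for _ in range(steps):
        weights = F.softmax(logits, dim=0)   # anchor weights over the vocabulary
        concept = weights @ vocab_embeds     # embedding stays in the vocabulary's span
        loss = denoising_loss(concept)       # reconstruct the single input image
        opt.zero_grad()
        loss.backward()
        opt.step()
    weights = F.softmax(logits.detach(), dim=0)
    return weights @ vocab_embeds, weights   # embedding + weights over vocab words
```

Under this reading, the learned weights would also account for the "related lexical predictions" mentioned above: the highest-weight vocabulary words name the decomposed concept along that axis.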
Related Material
[bibtex]
@InProceedings{Xu_2025_WACV,
  author    = {Xu, Zhi and Hao, Shaozhe and Han, Kai},
  title     = {CusConcept: Customized Visual Concept Decomposition with Diffusion Models},
  booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
  month     = {February},
  year      = {2025},
  pages     = {3678-3687}
}