IDEA-Bench: How Far are Generative Models from Professional Designing?

Liang, Chen; Huang, Lianghua; Fang, Jingwu; Dou, Huanzhang; Wang, Wei; Wu, Zhi-Fan; Shi, Yupeng; Zhang, Junge; Zhao, Xin; Liu, Yu

Chen Liang, Lianghua Huang, Jingwu Fang, Huanzhang Dou, Wei Wang, Zhi-Fan Wu, Yupeng Shi, Junge Zhang, Xin Zhao, Yu Liu; Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), 2025, pp. 18541-18551

Abstract

Recent advancements in image generation models enable the creation of high-quality images and targeted modifications based on textual instructions. Some models even support multimodal complex guidance and demonstrate robust task generalization capabilities. However, they still fall short of meeting the nuanced, professional demands of designers. To bridge this gap, we introduce IDEA-Bench, a comprehensive benchmark designed to advance image generation models toward applications with robust task generalization. IDEA-Bench comprises 100 professional image generation tasks and 275 specific cases, categorized into five major types based on the current capabilities of existing models. Furthermore, we provide a representative subset of 18 tasks with enhanced evaluation criteria to facilitate more nuanced and reliable evaluations using Multimodal Large Language Models (MLLMs). By assessing models' ability to comprehend and execute novel, complex tasks, IDEA-Bench paves the way toward the development of generative models with autonomous and versatile visual generation capabilities.

Related Material


[bibtex]

@InProceedings{Liang_2025_CVPR,
    author    = {Liang, Chen and Huang, Lianghua and Fang, Jingwu and Dou, Huanzhang and Wang, Wei and Wu, Zhi-Fan and Shi, Yupeng and Zhang, Junge and Zhao, Xin and Liu, Yu},
    title     = {IDEA-Bench: How Far are Generative Models from Professional Designing?},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {18541-18551}
}