@InProceedings{Li_2025_CVPR,
    author    = {Li, Ming and Zhong, Jike and Chen, Tianle and Lai, Yuxiang and Psounis, Konstantinos},
    title     = {EEE-Bench: A Comprehensive Multimodal Electrical And Electronics Engineering Benchmark},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {13337-13349}
}
EEE-Bench: A Comprehensive Multimodal Electrical And Electronics Engineering Benchmark
Abstract
Recent studies on large language models (LLMs) and large multimodal models (LMMs) have demonstrated promising skills in various domains, including science and mathematics. However, their capability in more challenging, real-world scenarios such as engineering has not been systematically studied. To bridge this gap, we propose EEE-Bench, a multimodal benchmark aimed at assessing LMMs' capabilities in solving practical engineering tasks, using electrical and electronics engineering (EEE) as the testbed. Our benchmark consists of 2860 hand-picked and carefully curated problems spanning 10 essential subdomains, such as analog circuits and control systems. Compared to other domains, engineering problems are intrinsically 1) more visually complex and versatile and 2) less deterministic in their solutions. Solving these problems often demands unusually rigorous integration of visual and textual information, as models need to understand intricate images such as abstract circuits and system diagrams while following professional instructions. Alongside EEE-Bench, we provide extensive quantitative evaluations, fine-grained analysis, and improvement methods using 17 widely used open- and closed-source LLMs and LMMs and 7 popular prompting techniques. Our results reveal notable deficiencies in current foundation models for EEE, including average performance ranging from 19.48% to 46.78% and a tendency toward "laziness" in overlooking essential visual context. In summary, we believe EEE-Bench not only reveals noteworthy limitations of LMMs but also provides a valuable resource for advancing research on their application to practical engineering tasks, driving future improvements in their capability to handle complex, real-world scenarios.