@InProceedings{Kulkarni_2025_ICCV,
    author    = {Kulkarni, Shreyas and Kumar, Vivek and Minz, Remish Leonard and Varshney, Munender and Samon, Thiruvengadam and Mitra, Abhishek and Kulkarni, Nikhil and Chakravortty, Nilanjan and Mital, Prateek and Banerjee, Kingshuk},
    title     = {Enhancing Circuit Diagram Understanding via Near Sight Correction Using VLMs},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
    month     = {October},
    year      = {2025},
    pages     = {4234-4242}
}
Enhancing Circuit Diagram Understanding via Near Sight Correction Using VLMs
Abstract
Automated circuit diagram understanding is essential for circuit digitization, circuit design verification, and education. Despite its practical importance, current methods struggle to interpret circuit diagrams reliably. Recent advances in Vision-Language Models (VLMs) have enabled significant progress in tasks such as Visual Question Answering (VQA), but VLMs still fail to capture basic visual relationships, such as line or circle intersections, that are essential in circuit schematics. In this work, we evaluate state-of-the-art VLMs for circuit diagram understanding and confirm that they face challenges in accurately identifying circuit connections. We propose Near Sight Correction (NSC), a pipeline that transforms the circuit diagram into a more meaningful, enhanced diagram by utilizing key elements. The pipeline automatically labels the connection key points in the original diagram. The enhanced diagram is then fed to a VLM for circuit understanding, either directly through graph generation or through additional VQA tasks. We evaluate our approach in three settings: (i) VQA on circuit components, (ii) VQA on circuit connections, and (iii) direct graph generation. We name these settings Circuit VQA, Connection VQA, and Connection Matrix Prediction, respectively. All three are evaluated on adaptations of the circuitvqa dataset [14]. The Circuit VQA task achieves an accuracy of 87.38%. The Connection VQA and Connection Matrix Prediction tasks achieve best F1 scores of 0.723 and 0.8735, respectively.
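To make the Connection Matrix Prediction metric concrete, the following is a minimal sketch of how an F1 score between a predicted and a ground-truth connection matrix could be computed. The abstract does not specify the exact scoring protocol, so everything here is an assumption: the function name `connection_f1`, the choice to treat the matrix as a symmetric 0/1 adjacency matrix, and the choice to count each undirected connection once (upper triangle only, no self-connections).

```python
from typing import List

Matrix = List[List[int]]


def connection_f1(pred: Matrix, truth: Matrix) -> float:
    """F1 over undirected connections in two square 0/1 matrices.

    Assumption (not from the paper): only entries above the diagonal are
    scored, so each connection between two key points is counted once.
    """
    n = len(truth)
    if len(pred) != n or any(len(row) != n for row in pred + truth):
        raise ValueError("pred and truth must be square matrices of equal size")

    tp = fp = fn = 0
    for i in range(n):
        for j in range(i + 1, n):
            p, t = bool(pred[i][j]), bool(truth[i][j])
            if p and t:
                tp += 1  # connection predicted and present
            elif p:
                fp += 1  # connection predicted but absent
            elif t:
                fn += 1  # connection present but missed

    if tp == 0:
        return 0.0  # no correct connections; also avoids division by zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical 4-node example: ground truth is the chain 0-1-2-3.
truth = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
# The prediction finds 0-1 and 1-2, misses 2-3, and adds a spurious 0-3.
pred = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
]
score = connection_f1(pred, truth)
```

In this example two of the three true connections are recovered and one predicted connection is spurious, so precision and recall are both 2/3 and the F1 score is 2/3. A different protocol, such as scoring every matrix entry or averaging per diagram, would give different numbers.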
