ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules

Zhi-Qi Cheng, Qi Dai, Alexander G. Hauptmann; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 22202-22213

Abstract


Charts are a powerful tool for visually conveying complex data, but their comprehension poses a challenge due to the diverse chart types and intricate components. Existing chart comprehension methods suffer from either heuristic rules or an over-reliance on OCR systems, resulting in suboptimal performance. To address these issues, we present ChartReader, a unified framework that seamlessly integrates chart derendering and comprehension tasks. Our approach includes a transformer-based chart component detection module and an extended pre-trained vision-language model for chart-to-X tasks. By learning the rules of charts automatically from annotated datasets, our approach eliminates the need for manual rule-making, reducing effort and enhancing accuracy. We also introduce a data variable replacement technique and extend the input and position embeddings of the pre-trained model for cross-task training. We evaluate ChartReader on Chart-to-Table, ChartQA, and Chart-to-Text tasks, demonstrating its superiority over existing methods. Our proposed framework can significantly reduce the manual effort involved in chart analysis, providing a step towards a universal chart understanding model. Moreover, our approach offers opportunities for plug-and-play integration with mainstream LLMs such as T5 and TaPas, extending their capability to chart comprehension tasks.

Related Material


[pdf] [arXiv]
[bibtex]
@InProceedings{Cheng_2023_ICCV, author = {Cheng, Zhi-Qi and Dai, Qi and Hauptmann, Alexander G.}, title = {ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules}, booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)}, month = {October}, year = {2023}, pages = {22202-22213} }