TRH2TQA: Table Recognition with Hierarchical Relationships to Table Question-Answering on Business Table Images

Pongsakorn Jirachanchaisiri, Nam Tuan Ly, Atsuhiro Takasu; Proceedings of the Winter Conference on Applications of Computer Vision (WACV), 2025, pp. 8826-8834

Abstract


Despite advancements in visual question answering, challenges persist with documents like financial reports, which are often organized in complicated tabular structures involving complex numerical computations. An alternative approach, the pipeline-driven methodology, includes table recognition (TR) and table question-answering (TQA). Recent advancements in TR support this approach with better accuracy and interpretability. However, real-world tables are usually hierarchical. They pose additional challenges due to merged cells and indents, necessitating a specific approach for hierarchical relationship extraction. In this paper, we propose TRH2TQA (Table Recognition with Hierarchical Relationships to Table Question-Answering) for business table images. It consists of three modules operating on table images with question-answer pairs. First, the TR module extracts structure and textual content from table images into HTML format. Second, post-structure extraction is applied to identify header and hierarchical relationships using the predicted column spans and bounding boxes. Finally, this information is combined with natural language questions in the TQA module to generate the answer through the decoder. In extensive experiments, TRH2TQA achieves superior question-answering performance on the VQAonBD 2023 dataset.
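
To make the second module more concrete, the following is a minimal sketch of how hierarchical relationships between row labels could be inferred from bounding-box indentation. It is not the authors' implementation: the abstract states only that predicted column span and bounding box are used. The `RowLabel` type, the `infer_parents` function, the `tol` tolerance, and the pixel values are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RowLabel:
    """First-column cell of one table row (hypothetical structure)."""
    text: str
    x_left: float  # left edge of the cell text's bounding box, in pixels


def infer_parents(rows: List[RowLabel], tol: float = 2.0) -> List[Optional[int]]:
    """Return, for each row, the index of its parent row or None.

    Assumption: a row is a child of the nearest preceding row whose text
    starts more than `tol` pixels further to the left (less indented).
    """
    parents: List[Optional[int]] = []
    stack: List[int] = []  # indices of currently open ancestor rows
    for i, row in enumerate(rows):
        # Close every ancestor that is not clearly less indented than this row.
        while stack and rows[stack[-1]].x_left >= row.x_left - tol:
            stack.pop()
        parents.append(stack[-1] if stack else None)
        stack.append(i)
    return parents


rows = [
    RowLabel("Assets", 10.0),
    RowLabel("Current assets", 20.0),
    RowLabel("Cash", 30.0),
    RowLabel("Inventory", 30.0),
    RowLabel("Non-current assets", 20.0),
    RowLabel("Liabilities", 10.0),
]
parents = infer_parents(rows)
```

In this toy example, "Cash" and "Inventory" are both assigned "Current assets" as their parent, while "Assets" and "Liabilities" are top-level rows. A real system would also need to handle merged header cells, which this sketch ignores.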

Related Material


@InProceedings{Jirachanchaisiri_2025_WACV,
    author    = {Jirachanchaisiri, Pongsakorn and Ly, Nam Tuan and Takasu, Atsuhiro},
    title     = {TRH2TQA: Table Recognition with Hierarchical Relationships to Table Question-Answering on Business Table Images},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {8826-8834}
}