Parsing Table Structures in the Wild

Rujiao Long, Wen Wang, Nan Xue, Feiyu Gao, Zhibo Yang, Yongpan Wang, Gui-Song Xia; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 944-952

Abstract


This paper tackles the problem of table structure parsing (TSP) from images in the wild. In contrast to existing studies that mainly focus on parsing well-aligned tabular images with simple layouts from scanned PDF documents, we aim to establish a practical table structure parsing system for real-world scenarios where tabular input images are taken or scanned with severe deformation, bending, or occlusions. To design such a system, we propose an approach named Cycle-CenterNet, built on top of CenterNet, with a novel cycle-pairing module that simultaneously detects and groups tabular cells into structured tables. In the cycle-pairing module, a new pairing loss function is proposed for network training. Alongside Cycle-CenterNet, we also present a large-scale dataset, named Wired Table in the Wild (WTW), which includes well-annotated structure parsing of tables in multiple styles across several scenes such as photos, scanned files, and web pages. In experiments, we demonstrate that Cycle-CenterNet consistently achieves the best accuracy of table structure parsing on the new WTW dataset, with a 24.6% absolute improvement as evaluated by the TEDS metric. A more comprehensive experimental analysis also validates the advantages of our proposed methods for the TSP task.

Related Material


[bibtex]
@InProceedings{Long_2021_ICCV,
    author    = {Long, Rujiao and Wang, Wen and Xue, Nan and Gao, Feiyu and Yang, Zhibo and Wang, Yongpan and Xia, Gui-Song},
    title     = {Parsing Table Structures in the Wild},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {944-952}
}