Information Extraction From Document Images via FCA-Based Template Detection and Knowledge Graph Rule Induction

Rastogi, Mouli; Ali, Syed Afshan; Rawat, Mrinal; Vig, Lovekesh; Agarwal, Puneet; Shroff, Gautam; Srinivasan, Ashwin

Mouli Rastogi, Syed Afshan Ali, Mrinal Rawat, Lovekesh Vig, Puneet Agarwal, Gautam Shroff, Ashwin Srinivasan; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2020, pp. 558-559

Abstract

We view information extraction from document images as a complex problem that requires a combination of 1) state-of-the-art deep learning vision models for detecting entities and primitive relations, 2) symbolic background knowledge that expresses prior information about spatial and semantic relationships, using the entities and primitive relations from the neural detectors, and 3) learning of symbolic extraction rules from one or a few examples of annotated document images. Several challenges arise in ensuring that this neuro-symbolic software stack works together seamlessly. These include vision-based challenges of ensuring that the documents are "seen" at the appropriate level of detail to detect entities; symbolic representation challenges of identifying primitive relations between the entities identified by the vision system; and learning-based challenges of identifying the appropriate level of symbolic abstraction for the retrieval rules, identifying background knowledge that is relevant to the documents being analyzed, and learning general symbolic rules in data-deficient domains. In this paper, we describe how we meet some of these challenges in the design of our document-reading platform. In particular, we focus on use cases with multiple templates, which additionally involve finding structurally similar images in large heterogeneous document image collections. An adaptive lattice-based template allocation module is used to evaluate document similarity based on both textual content and document structure. A knowledge graph captures document structure, and a relational rule learning system is applied to the knowledge graph to generate extraction rules. Experiments on a publicly shared dataset of 1400 trade finance documents demonstrate the viability of the proposed system.

Related Material

[bibtex]

@InProceedings{Rastogi_2020_CVPR_Workshops,
author = {Rastogi, Mouli and Ali, Syed Afshan and Rawat, Mrinal and Vig, Lovekesh and Agarwal, Puneet and Shroff, Gautam and Srinivasan, Ashwin},
title = {Information Extraction From Document Images via FCA-Based Template Detection and Knowledge Graph Rule Induction},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2020},
pages = {558-559}
}