Improved Information Extraction by Leveraging Multi-Hypothesis OCR at Inference Time

Arthur Hemmer, Nicola Bartolo, Mickaël Coustaty, Jean-Marc Ogier; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2025, pp. 7576-7584

Abstract


Extracting information from documents is a critical task for many industrial use-cases. Errors in this process can arise from both the visual recognition and the semantic labeling processes. Traditional approaches to mitigate these errors involve collecting more data and training larger models, which can be resource-intensive. In this paper, we propose a backtracking, constrained decoding approach that aims to correct OCR reading and information extraction at the inference stage without retraining or using additional error correction models. By leveraging OCR confidence scores and top-k predictions, we explore multiple high-probability OCR readings until we reach a constraint-satisfying extraction. Our approach uses computational resources during inference to jointly solve both OCR and extraction issues. We demonstrate the effectiveness of our method on two datasets, showing consistent improvements in F1-score, compared to degradation in extraction performance when using a state-of-the-art post-OCR error correction model specifically fine-tuned on these datasets.
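
The core idea of exploring high-probability OCR readings until a constraint is satisfied can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract describes backtracking inside a constrained decoding procedure, whereas this sketch only enumerates character-level readings best-first and checks a constraint on each. The function names, the candidate format, the probabilities, and the amount-format constraint are all hypothetical, chosen for illustration.

```python
import heapq
import re
from math import log


def best_first_readings(candidates):
    """Yield readings in order of decreasing joint probability.

    `candidates` (assumed format) holds, for each position, a list of
    (text, probability) pairs sorted by decreasing probability, i.e. the
    OCR engine's top-k predictions with their confidence scores.
    """
    n = len(candidates)

    def score(idx):
        # Joint log-probability, assuming positions are independent.
        return sum(log(candidates[i][j][1]) for i, j in enumerate(idx))

    start = (0,) * n  # the top-1 reading at every position
    heap = [(-score(start), start)]
    seen = {start}
    while heap:
        _, idx = heapq.heappop(heap)
        yield [candidates[i][j][0] for i, j in enumerate(idx)]
        # Successors: demote exactly one position to its next-best candidate.
        for i in range(n):
            if idx[i] + 1 < len(candidates[i]):
                nxt = idx[:i] + (idx[i] + 1,) + idx[i + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(heap, (-score(nxt), nxt))


def extract_with_constraint(candidates, is_valid, max_tries=50):
    """Return the most probable reading that satisfies `is_valid`, or None."""
    for tries, reading in enumerate(best_first_readings(candidates)):
        if tries >= max_tries:
            break  # bound the extra inference-time compute
        text = "".join(reading)
        if is_valid(text):
            return text
    return None


def is_amount(text):
    # Hypothetical field constraint: digits, a dot, then two decimals.
    return re.fullmatch(r"\d+\.\d{2}", text) is not None


# Hypothetical top-k output for a field whose top-1 reading is "l2.50".
candidates = [
    [("l", 0.6), ("1", 0.4)],
    [("2", 0.99)],
    [(".", 0.95), (",", 0.05)],
    [("5", 0.9), ("S", 0.1)],
    [("0", 0.8), ("O", 0.2)],
]

print(extract_with_constraint(candidates, is_amount))  # prints 12.50
```

In this toy case the top-1 reading "l2.50" violates the constraint, and the next most probable reading swaps the least confident character to give "12.50". The `max_tries` bound reflects the trade-off the abstract mentions: accuracy is bought with additional computation at inference time instead of retraining.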

Related Material


[bibtex]
@InProceedings{Hemmer_2025_ICCV,
    author    = {Hemmer, Arthur and Bartolo, Nicola and Coustaty, Micka\"el and Ogier, Jean-Marc},
    title     = {Improved Information Extraction by Leveraging Multi-Hypothesis OCR at Inference Time},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
    month     = {October},
    year      = {2025},
    pages     = {7576-7584}
}