DocMatcher: Document Image Dewarping via Structural and Textual Line Matching

Hertlein, Felix; Naumann, Alexander; Sure-Vetter, York

Proceedings of the Winter Conference on Applications of Computer Vision (WACV), 2025, pp. 5771-5780

Abstract

Document image dewarping is a crucial step in the digitization of physical documents: it aims to remove the distortions caused by challenging environmental conditions and sheet deformations that are often encountered when capturing images with smartphone cameras. Recently, deep learning-based methods have been combined at inference time with knowledge about the expected document structure, also known as a template, to improve dewarping results. Our contributions in this work are threefold: (1) we propose a novel document image dewarping approach that effectively leverages prior knowledge about the document structure by detecting and matching lines from the warped and the template domain; (2) we introduce a novel evaluation metric, the matched normalized character error rate (mnCER), to overcome the limitations of existing metrics in evaluating the dewarping process; and (3) we evaluate our approach on the Inv3DReal dataset and show that it outperforms state-of-the-art methods in terms of visual and text-based metrics. Our approach improves upon the state of the art by 32.6% in Local Distortion and 40.2% in mnCER. Our code and models are available at https://felixhertlein.github.io/doc-matcher.

Related Material

@InProceedings{Hertlein_2025_WACV,
    author    = {Hertlein, Felix and Naumann, Alexander and Sure-Vetter, York},
    title     = {DocMatcher: Document Image Dewarping via Structural and Textual Line Matching},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {5771-5780}
}