An End-to-End Approach for Handwriting Recognition: From Handwritten Text Lines to Complete Pages

Dayvid Castro, Byron Leite Dantas Bezerra, Cleber Zanchettin; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 264-273

Abstract


Handwritten Document Recognition (HDR) has emerged as a challenging task, integrating text and layout information recognition to tackle manuscripts end-to-end. Despite advancements, the computational efficiency of processing entire documents remains a critical challenge, limiting the practical applicability of these models. This paper presents the Document Attention Network for Computationally Efficient Recognition (DANCER). The model differs from existing approaches in its encoder-decoder structure: the encoder reduces spatial redundancy and enhances spatial attention, while the decoder, composed of transformer layers, efficiently decodes the text using optimized attention operations. This design results in a fast, memory-efficient model capable of effectively transcribing and understanding complex manuscript layouts. We evaluated DANCER's efficacy on the ICFHR 2016 READ competition dataset, focusing on recognizing single- and double-page historical documents. We demonstrate how DANCER can triple the training batch size compared to prior models within the same memory limits and reduce memory usage by up to 65% without compromising recognition quality. The proposed approach sets new standards in efficiency and accuracy for HDR solutions, paving the way for practical and scalable applications in diverse contexts.

Related Material


@InProceedings{Castro_2024_CVPR,
    author    = {Castro, Dayvid and Bezerra, Byron Leite Dantas and Zanchettin, Cleber},
    title     = {An End-to-End Approach for Handwriting Recognition: From Handwritten Text Lines to Complete Pages},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {264-273}
}