DocSynthv2: A Practical Autoregressive Modeling for Document Generation

Biswas, Sanket; Jain, Rajiv; Morariu, Vlad I.; Gu, Jiuxiang; Mathur, Puneet; Wigington, Curtis; Sun, Tong; Lladós, Josep

Sanket Biswas, Rajiv Jain, Vlad I. Morariu, Jiuxiang Gu, Puneet Mathur, Curtis Wigington, Tong Sun, Josep Lladós; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 8148-8153

Abstract

While the generation of document layouts has been extensively explored, comprehensive document generation, encompassing both layout and content, presents a more complex challenge. This paper delves into this advanced domain, proposing a novel approach called DocSynthv2 through the development of a simple yet effective autoregressive structured model. Our model, distinct in its integration of both layout and textual cues, marks a step beyond existing layout-generation approaches. By focusing on the relationship between the structural elements and the textual content within documents, we aim to generate cohesive and contextually relevant documents without any reliance on visual components. Through experimental studies on our curated benchmark for the new task, we demonstrate that our model, by combining layout and textual information, enhances the generation quality and relevance of documents, opening new pathways for research in document creation and automated design. Our findings emphasize the effectiveness of autoregressive models in handling complex document generation tasks.

Related Material

[bibtex]

@InProceedings{Biswas_2024_CVPR,
    author    = {Biswas, Sanket and Jain, Rajiv and Morariu, Vlad I. and Gu, Jiuxiang and Mathur, Puneet and Wigington, Curtis and Sun, Tong and Llad\'os, Josep},
    title     = {DocSynthv2: A Practical Autoregressive Modeling for Document Generation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {8148-8153}
}