@InProceedings{Kim_2025_WACV,
    author    = {Kim, Sangyeon and Lee, Sangkuk and Kim, Jeesoo and Kwak, Nojun},
    title     = {TPD-STR: Text Polygon Detection with Split Transformers},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {8940-8949}
}
TPD-STR: Text Polygon Detection with Split Transformers
Abstract
Regressing text in natural scenes with polygonal representations is challenging because the shapes are difficult to predict. To address this, we introduce Text Polygon Detection with Split Transformers (TPD-STR), which directly regresses polygonal points. TPD-STR incorporates the Decoder Split (DS) architecture, which separates polygonal point regression from textness classification, and the Positional Information Propagation (PIP) module, which enhances classification. Both modules are effective and compatible with existing methods. TPD-STR achieves state-of-the-art (SOTA) performance among regression-based methods, surpassing segmentation-based methods on MSRA-TD500 without external data. Adding DS and PIP to existing models further improves their performance. Experiments demonstrate the model's ability to detect text instances effectively.