Multi-Oriented Text Detection With Fully Convolutional Networks

Zheng Zhang, Chengquan Zhang, Wei Shen, Cong Yao, Wenyu Liu, Xiang Bai; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 4159-4167

Abstract


In this paper, we propose an unconventional approach for text detection in natural images. Both global and local cues are taken into account for localizing text lines in a coarse-to-fine procedure. First, a Fully Convolutional Network (FCN) model is trained to predict a salient map of text regions in a holistic manner. Then, a set of text line hypotheses is estimated by combining the salient map with MSER components. Finally, another FCN classifier is used to predict the centroid of each character, in order to remove false hypotheses. The framework is general enough to handle text in multiple orientations, languages, and fonts. The proposed method consistently achieves state-of-the-art performance on three text detection benchmarks: MSRA-TD500, ICDAR2015, and ICDAR2013.
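
The coarse-to-fine flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the salient map and character centroids are hand-made arrays rather than FCN outputs, MSER is replaced by a plain threshold, and only horizontal lines are handled. All function names, the `threshold` value, and the `min_chars` rule are assumptions introduced purely for illustration.

```python
import numpy as np

def text_regions(salient_map, threshold=0.5):
    """Stage 1 stand-in: binarize a salient map into a text / non-text mask.
    (In the paper the map comes from an FCN; here it is just an input array.)"""
    return salient_map >= threshold

def line_hypotheses(mask):
    """Stage 2 stand-in: group mask rows into horizontal bands.
    Returns boxes as (top, bottom, left, right), with bottom/right exclusive.
    (Hypothetical simplification: no MSER, no orientation estimate.)"""
    rows = mask.any(axis=1)
    boxes, top = [], None
    # Append a trailing False so a band touching the last row is closed.
    for y, on in enumerate(np.append(rows, False)):
        if on and top is None:
            top = y
        elif not on and top is not None:
            cols = np.flatnonzero(mask[top:y].any(axis=0))
            boxes.append((top, y, int(cols[0]), int(cols[-1]) + 1))
            top = None
    return boxes

def filter_hypotheses(boxes, centroids, min_chars=2):
    """Stage 3 stand-in: keep boxes that contain at least min_chars
    character centroids, given as (y, x) pairs. (Assumed rejection rule.)"""
    kept = []
    for top, bottom, left, right in boxes:
        count = sum(1 for y, x in centroids
                    if top <= y < bottom and left <= x < right)
        if count >= min_chars:
            kept.append((top, bottom, left, right))
    return kept

# Toy input: two salient bands, but only the first holds two centroids.
salient = np.zeros((8, 10))
salient[1:3, 2:8] = 0.9
salient[5:7, 1:4] = 0.8
centroids = [(1, 3), (2, 6), (5, 2)]

boxes = line_hypotheses(text_regions(salient))
kept = filter_hypotheses(boxes, centroids)
print(kept)  # [(1, 3, 2, 8)]
```

In this toy input the second band is discarded because it contains a single centroid, which mirrors the role the abstract assigns to the character-centroid stage: rejecting hypotheses that the coarse map alone would accept.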

Related Material


[bibtex]
@InProceedings{Zhang_2016_CVPR,
author = {Zhang, Zheng and Zhang, Chengquan and Shen, Wei and Yao, Cong and Liu, Wenyu and Bai, Xiang},
title = {Multi-Oriented Text Detection With Fully Convolutional Networks},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2016}
}