Adapting Style and Content for Attended Text Sequence Recognition

Schwarcz, Steven; Gorban, Alexander; Gibert, Xavier; Lee, Dar-Shyang

Steven Schwarcz, Alexander Gorban, Xavier Gibert, Dar-Shyang Lee; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2020, pp. 1597-1606

Abstract

In this paper, we address the problem of learning to perform sequential OCR on photos of street name signs in a language for which no labeled data exists. Our approach leverages easily generated synthetic data and existing labeled data in other languages to achieve reasonable performance on these unlabeled images, combining a novel domain adaptation technique based on gradient reversal with a multi-task learning scheme. To accomplish this, we introduce and release two new datasets, Hebrew Street Name Signs (HSNS) and Synthetic Hebrew Street Name Signs (SynHSNS), while also making use of the existing French Street Name Signs (FSNS) dataset. We demonstrate that a synthetic dataset of Hebrew characters and a labeled dataset of French street name signs in natural images together yield a significant improvement on real Hebrew street name sign transcription. In this setting, the synthetic Hebrew data and the real French data each overlap with different features of the images we wish to transcribe.
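The abstract names gradient reversal as the basis of the domain adaptation technique. The sketch below illustrates only the generic gradient reversal idea (identity in the forward pass, negated and scaled gradient in the backward pass), not the authors' novel variant, which the abstract does not specify. All names here (`GradientReversal`, `domain_loss_grads`, `lam`) and the toy linear domain classifier are hypothetical and chosen for illustration.

```python
import math


class GradientReversal:
    """Hypothetical layer: identity forward, gradient scaled by -lam backward."""

    def __init__(self, lam=1.0):
        self.lam = lam  # assumed reversal strength; not taken from the paper

    def forward(self, features):
        # Features pass through unchanged.
        return list(features)

    def backward(self, grad_output):
        # Flip the sign so the feature extractor is pushed to confuse
        # the domain classifier rather than help it.
        return [-self.lam * g for g in grad_output]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def domain_loss_grads(features, weights, domain_label, grl):
    """Toy linear domain classifier trained with binary cross-entropy.

    Returns the loss, the gradient for the classifier weights, and the
    reversed gradient that would flow back into the feature extractor.
    """
    passed = grl.forward(features)
    logit = sum(w * f for w, f in zip(weights, passed))
    prob = sigmoid(logit)
    loss = -(domain_label * math.log(prob)
             + (1 - domain_label) * math.log(1.0 - prob))
    dlogit = prob - domain_label  # derivative of the loss w.r.t. the logit
    grad_weights = [dlogit * f for f in passed]  # classifier learns normally
    grad_features = grl.backward([dlogit * w for w in weights])  # reversed
    return loss, grad_weights, grad_features


grl = GradientReversal(lam=0.5)
features = [1.0, -2.0]   # stand-in for extracted image features
weights = [0.3, 0.1]     # stand-in for domain classifier weights
loss, grad_w, grad_f = domain_loss_grads(features, weights, 1, grl)
```

In this toy setup the classifier weights receive the ordinary gradient, while the features receive the same gradient with its sign flipped and scaled by `lam`, which is what lets a single backward pass train the two parts toward opposing goals.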

Related Material

[bibtex]

@InProceedings{Schwarcz_2020_WACV,
author = {Schwarcz, Steven and Gorban, Alexander and Gibert, Xavier and Lee, Dar-Shyang},
title = {Adapting Style and Content for Attended Text Sequence Recognition},
booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
month = {March},
year = {2020},
pages = {1597--1606}
}