A Benchmark for Chinese-English Scene Text Image Super-Resolution

Jianqi Ma, Zhetong Liang, Wangmeng Xiang, Xi Yang, Lei Zhang; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 19452-19461


Scene Text Image Super-resolution (STISR) aims to recover high-resolution (HR) scene text images with visually pleasant and readable text content from a given low-resolution (LR) input. Most existing works focus on recovering English text, whose characters have simple structures, while little work has been done on the more challenging Chinese text, whose characters have diverse and complex structures. In this paper, we propose a real-world Chinese-English benchmark dataset, namely Real-CE, for the task of STISR, with an emphasis on restoring structurally complex Chinese characters. The benchmark provides 1,935/783 real-world LR-HR text image pairs (containing 33,789 text lines in total) for training/testing in 2x and 4x zooming modes, complemented by detailed annotations, including detection boxes and text transcripts. Moreover, we design an edge-aware learning method, which provides structural supervision in both the image and feature domains, to effectively reconstruct the dense structures of Chinese characters. We conduct experiments on the proposed Real-CE benchmark and evaluate existing STISR models with and without our edge-aware loss. The benchmark, including data and source code, will be made publicly available.
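To make the idea of image-domain structural supervision concrete, the following is a minimal sketch of one way an edge-aware loss could be formed. It is not the authors' implementation: the abstract does not specify the edge detector, the loss form, or the weighting, so the Sobel operator, the L1 distance, and the names `edge_map`, `edge_aware_l1`, and `edge_weight` are all assumptions made for illustration. The feature-domain supervision mentioned in the abstract is not covered here.

```python
import numpy as np

# Sobel kernels: an assumed edge detector (the abstract does not name one).
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = SOBEL_X.T


def filter3x3(img, kernel):
    """Cross-correlate a 2-D image with a 3x3 kernel, using edge padding."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


def edge_map(img):
    """Gradient magnitude of a grayscale image (hypothetical helper)."""
    gx = filter3x3(img, SOBEL_X)
    gy = filter3x3(img, SOBEL_Y)
    return np.sqrt(gx ** 2 + gy ** 2)


def edge_aware_l1(sr, hr, edge_weight=1.0):
    """Pixel L1 plus a weighted L1 between edge maps (assumed loss form)."""
    pixel_term = np.mean(np.abs(sr - hr))
    edge_term = np.mean(np.abs(edge_map(sr) - edge_map(hr)))
    return pixel_term + edge_weight * edge_term
```

In this sketch, a super-resolved image that matches the HR intensities on average but blurs the character strokes is penalized through the edge term, which is the intuition behind adding structural supervision for dense Chinese characters. A training setup would use a differentiable framework rather than NumPy; the arithmetic would be the same.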

Related Material

@InProceedings{Ma_2023_ICCV,
    author    = {Ma, Jianqi and Liang, Zhetong and Xiang, Wangmeng and Yang, Xi and Zhang, Lei},
    title     = {A Benchmark for Chinese-English Scene Text Image Super-Resolution},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {19452-19461}
}