Scene text detection and recognition: The deep learning era

S Long, X He, C Yao - International Journal of Computer Vision, 2021 - Springer
With the rise and development of deep learning, computer vision has been tremendously
transformed and reshaped. As an important research area in computer vision, scene text …

Text recognition in the wild: A survey

X Chen, L Jin, Y Zhu, C Luo, T Wang - ACM Computing Surveys (CSUR), 2021 - dl.acm.org
The history of text can be traced back over thousands of years. Rich and precise semantic
information carried by text is important in a wide range of vision-based application …

DocVQA: A dataset for VQA on document images

M Mathew, D Karatzas… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
We present a new dataset for Visual Question Answering (VQA) on document images called
DocVQA. The dataset consists of 50,000 questions defined on 12,000+ document images …

Mask TextSpotter: An end-to-end trainable neural network for spotting text with arbitrary shapes

P Lyu, M Liao, C Yao, W Wu… - Proceedings of the …, 2018 - openaccess.thecvf.com
Recently, models based on deep neural networks have dominated the fields of scene text
detection and recognition. In this paper, we investigate the problem of scene text spotting …

ASTER: An attentional scene text recognizer with flexible rectification

B Shi, M Yang, X Wang, P Lyu, C Yao… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
A challenging aspect of scene text recognition is to handle text with distortions or irregular
layout. In particular, perspective text and curved text are common in natural scenes and are …

ESIR: End-to-end scene text recognition via iterative image rectification

F Zhan, S Lu - Proceedings of the IEEE/CVF conference on …, 2019 - openaccess.thecvf.com
Automated recognition of texts in scenes has been a research challenge for years, largely
due to the arbitrary text appearance variation in perspective distortion, text line curvature …

MORAN: A multi-object rectified attention network for scene text recognition

C Luo, L Jin, Z Sun - Pattern Recognition, 2019 - Elsevier
Irregular text is widely used. However, it is considerably difficult to recognize because of its
various shapes and distorted patterns. In this paper, we thus propose a multi-object rectified …

TextCaps: a dataset for image captioning with reading comprehension

O Sidorov, R Hu, M Rohrbach, A Singh - … 23–28, 2020, Proceedings, Part II …, 2020 - Springer
Image descriptions can help visually impaired people to quickly understand the image
content. While we made significant progress in automatically describing images and optical …

TAP: Text-aware pre-training for Text-VQA and Text-Caption

Z Yang, Y Lu, J Wang, X Yin… - Proceedings of the …, 2021 - openaccess.thecvf.com
In this paper, we propose Text-Aware Pre-training (TAP) for Text-VQA and Text-Caption
tasks. These two tasks aim at reading and understanding scene text in images for question …

Focusing attention: Towards accurate text recognition in natural images

Z Cheng, F Bai, Y Xu, G Zheng… - Proceedings of the …, 2017 - openaccess.thecvf.com
Scene text recognition has been a hot research topic in computer vision due to its various
applications. The state of the art is the attention-based encoder-decoder framework that …