From two to one: A new scene text recognizer with visual language modeling network

Y Wang, H Xie, S Fang, J Wang… - Proceedings of the …, 2021 - openaccess.thecvf.com
In this paper, we abandon the dominant complex language model and rethink the linguistic
learning process in scene text recognition. Different from previous methods considering …

Towards accurate scene text recognition with semantic reasoning networks

D Yu, X Li, C Zhang, T Liu, J Han… - Proceedings of the …, 2020 - openaccess.thecvf.com
Scene text images contain two levels of content: visual texture and semantic information.
Although previous scene text recognition methods have made great progress over the …

Revisiting scene text recognition: A data perspective

Q Jiang, J Wang, D Peng, C Liu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
This paper aims to re-assess scene text recognition (STR) from a data-oriented perspective.
We begin by revisiting the six commonly used benchmarks in STR and observe a trend of …

SVTR: Scene text recognition with a single visual model

Y Du, Z Chen, C Jia, X Yin, T Zheng, C Li, Y Du… - arXiv preprint arXiv …, 2022 - arxiv.org
Dominant scene text recognition models commonly contain two building blocks, a visual
model for feature extraction and a sequence model for text transcription. This hybrid …

On vocabulary reliance in scene text recognition

Z Wan, J Zhang, L Zhang, J Luo… - Proceedings of the …, 2020 - openaccess.thecvf.com
The pursuit of high performance on public benchmarks has been the driving force for
research in scene text recognition, and notable progress has been achieved. However, a …

Read like humans: Autonomous, bidirectional and iterative language modeling for scene text recognition

S Fang, H Xie, Y Wang, Z Mao… - Proceedings of the …, 2021 - openaccess.thecvf.com
Linguistic knowledge is of great benefit to scene text recognition. However, how to effectively
model linguistic rules in end-to-end deep networks remains a research challenge. In this …

MASTER: Multi-aspect non-local network for scene text recognition

N Lu, W Yu, X Qi, Y Chen, P Gong, R Xiao, X Bai - Pattern Recognition, 2021 - Elsevier
Attention-based scene text recognizers have achieved great success by leveraging a more
compact intermediate representation to learn 1D or 2D attention with an RNN-based encoder …

Symmetrical linguistic feature distillation with CLIP for scene text recognition

Z Wang, H Xie, Y Wang, J Xu, B Zhang… - Proceedings of the 31st …, 2023 - dl.acm.org
In this paper, we explore the potential of the Contrastive Language-Image Pretraining (CLIP)
model in scene text recognition (STR), and establish a novel Symmetrical Linguistic Feature …

Multi-modal text recognition networks: Interactive enhancements between visual and semantic features

B Na, Y Kim, S Park - European Conference on Computer Vision, 2022 - Springer
Linguistic knowledge has brought great benefits to scene text recognition by providing
semantics to refine character sequences. However, since linguistic knowledge has been …

PIMNet: A parallel, iterative and mimicking network for scene text recognition

Z Qiao, Y Zhou, J Wei, W Wang, Y Zhang… - Proceedings of the 29th …, 2021 - dl.acm.org
Nowadays, scene text recognition has attracted more and more attention due to its various
applications. Most state-of-the-art methods adopt an encoder-decoder framework with …