Decoder Pre-Training with only Text for Scene Text Recognition

S Zhao, Y Du, Z Chen, YG Jiang - arXiv preprint arXiv:2408.05706, 2024 - arxiv.org
text in natural scenes, known as scene text recognition (STR), is regarded as a core task of
optical character recognition … Recently, we observe that visual-language models like CLIP [48]…

MASTER: Multi-aspect non-local network for scene text recognition

N Lu, W Yu, X Qi, Y Chen, P Gong, R Xiao, X Bai - Pattern Recognition, 2021 - Elsevier
… [13] propose to ensemble attention and language models in an attention-based architecture.
… Images are generated from side-view angle snapshots in Google Street View. Therefore, …

Locate then generate: Bridging vision and language with bounding box for scene-text vqa

Y Zhu, Z Liu, Y Liang, X Li, H Liu, C Bao… - Proceedings of the AAAI …, 2023 - ojs.aaai.org
… (b) Scene text recognition mistakes in the STVQA task. … out the correct scene text words, we
design a language refinement network based on a pre-trained language model to distinguish …

VL-Reader: Vision and Language Reconstructor is an Effective Scene Text Recognizer

H Zhong, Z Yang, Z Li, P Wang, J Tang, W Cheng… - ACM Multimedia … - openreview.net
scene text recognition approach by tuning a vision and language reconstructor to a text
Our approach, based on mask and reconstruction, not only learns rich visual and semantic …

2D positional embedding-based transformer for scene text recognition

Z Raisi, MA Naiel, P Fieguth… - … Vision and Imaging …, 2020 - openjournals.uwaterloo.ca
recognition accuracy remains far from expectations. The Transformer’s architecture [25] is a
novel framework, introduced first for natural language … exist in natural language processing …

Synthetically supervised feature learning for scene text recognition

Y Liu, Z Wang, H Jin, I Wassell - … on Computer Vision …, 2018 - openaccess.thecvf.com
… to most deep learning based text recognition models. 2. We design a novel scene text
recognition algorithm that learns a descriptive and robust text representation (image feature) …

Primitive representation learning for scene text recognition

R Yan, L Peng, S Xiao, G Yao - … and pattern recognition, 2021 - openaccess.thecvf.com
Scene text recognition is a challenging task due to diverse variations of text instances in
natural scene … efficient feature representations for multi-oriented scene texts. In this paper, we …

KISS: Keeping it simple for scene text recognition

C Bartz, J Bethge, H Yang, C Meinel - arXiv preprint arXiv:1911.08400, 2019 - arxiv.org
… a new model for scene text recognition that only consists of … scene text recognition network
that does not use any building blocks especially designed for the task of scene text recognition

Scene text recognition: an Indic perspective

VP Vijayan, S Chanda, D Doermann… - … Analysis and Recognition …, 2024 - Springer
visual features and language … Tamil, Malayalam, and Telugu scene text images
that are part of the IIIT-ILST dataset. The scripts for these languages share similar character

ABINet++: Autonomous, bidirectional and iterative language modeling for scene text spotting

S Fang, Z Mao, H Xie, Y Wang, C Yan… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
… Finally, for the VM in scene text recognition, we develop a position attention module to make
visual predictions parallelly based on character order, which employs a U-Net to enhance …