Turning a CLIP model into a scene text detector

W Yu, Y Liu, W Hua, D Jiang… - … Pattern Recognition, 2023 - openaccess.thecvf.com
… on vision language models have made effective progresses in the field of text detection. In
… (3) By turning the CLIP model into existing scene text detection methods, we further achieve …

Image as a Language: Revisiting Scene Text Recognition via Balanced, Unified and Synchronized Vision-Language Reasoning Network

J Wei, H Zhan, Y Lu, X Tu, B Yin, C Liu… - Proceedings of the AAAI …, 2024 - ojs.aaai.org
… How to effectively and jointly model vision and language to … vision-language reasoning
in scene text recognition, we present a balanced, unified and synchronized vision-language

Display-semantic transformer for scene text recognition

X Yang, W Silamu, M Xu, Y Li - Sensors, 2023 - mdpi.com
… help understand visual cues in scene text recognition. Our model designs a masked language
… masked language model, we build a semantic visual interaction module to help the model …

Hierarchical visual-semantic interaction for scene text recognition

L Diao, X Tang, J Wang, G Xie, J Hu - Information Fusion, 2024 - Elsevier
… We have presented a hierarchical visual-semantic interaction (HVSI) … visual information
for scene text recognition, which does not require a pre-training process or external language

Visual semantics allow for textual reasoning better in scene text recognition

Y He, C Chen, J Zhang, J Liu, F He, C Wang… - Proceedings of the AAAI …, 2022 - ojs.aaai.org
… a graph-based context reasoning model that supplements the language model to exploit
both visual spatial context and linguistic context to improve the visual recognition results. …

Revisiting scene text recognition: A data perspective

Q Jiang, J Wang, D Peng, C Liu… - … on computer vision, 2023 - openaccess.thecvf.com
… From Two to One: A new scene text recognizer with visual language modeling network. In
CVPR, pages 14194–14203, 2021. 1, 4, 7 [58] Xudong Xie, Ling Fu, Zhifei Zhang, Zhaowen …

Review of scene text detection and recognition

H Lin, P Yang, F Zhang - Archives of computational methods in …, 2020 - Springer
… , scene text detection is a challenging problem. Similar to majority of computer vision tasks,
most previous text detection … With text recognition, techniques related to language model and …

MaskOCR: Scene Text Recognition with Masked Vision-Language Pre-training

P Lyu, C Zhang, S Liu, M Qiao, Y Xu, L Wu… - … on Machine Learning … - openreview.net
… In this paper, we explore the utilization of both visual and language priors through pre-training
to enhance text recognition performance. Our approach unifies vision and language pre-…

Traditional to transfer learning progression on scene text detection and recognition: a survey

N Gupta, AS Jalal - Artificial Intelligence Review, 2022 - Springer
… scene text reading system, i.e., salient text detection, text or non-text image classification, a
fusion of scene text in vision … , integration of scene text in vision and language, multilingual text

Multi-lingual scene text detection and language identification

S Saha, N Chakraborty, S Kundu, S Paul… - Pattern Recognition …, 2020 - Elsevier
… complexities, image quality, text orientation, text size, etc. The … Most scene text detection
techniques approach the problem … scene text detection, localization and language identification