J Wei, H Zhan, Y Lu, X Tu, B Yin, C Liu… - Proceedings of the AAAI …, 2024 - ojs.aaai.org
… How to effectively and jointly model vision and language to … vision-language reasoning in scene text recognition, we present a balanced, unified and synchronized vision-language …
X Yang, W Silamu, M Xu, Y Li - Sensors, 2023 - mdpi.com
… help understand visual cues in scene text recognition. Our model designs a masked language … masked language model, we build a semantic visual interaction module to help the model …
… We have presented a hierarchical visual-semantic interaction (HVSI) … visual information for scene text recognition, which does not require a pre-training process or external language …
… a graph-based context reasoning model that supplements the language model to exploit both visual spatial context and linguistic context to improve the visual recognition results. …
Q Jiang, J Wang, D Peng, C Liu… - … on computer vision, 2023 - openaccess.thecvf.com
… From Two to One: A new scene text recognizer with visual language modeling network. In CVPR, pages 14194–14203, 2021. 1, 4, 7 [58] Xudong Xie, Ling Fu, Zhifei Zhang, Zhaowen …
H Lin, P Yang, F Zhang - Archives of computational methods in …, 2020 - Springer
… , scene text detection is a challenging problem. Similar to majority of computer vision tasks, most previous text detection … With text recognition, techniques related to language model and …
P Lyu, C Zhang, S Liu, M Qiao, Y Xu, L Wu… - … on Machine Learning … - openreview.net
… In this paper, we explore the utilization of both visual and language priors through pre-training to enhance text recognition performance. Our approach unifies vision and language pre-…
N Gupta, AS Jalal - Artificial Intelligence Review, 2022 - Springer
… scene text reading system i.e. salient text detection, text or non-text image classification, a fusion of scene text in vision … , integration of scene text in vision and language, multilingual text …
… complexities, image quality, text orientation, text size, etc. The … Most scene text detection techniques approach the problem … scene text detection, localization and language identification …