scene text recognition vision language - Academic Resource Search

Language matters: A weakly supervised vision-language pre-training approach for scene text detection and spotting

C Xue, W Zhang, Y Hao, S Lu, PHS Torr… - … on Computer Vision, 2022 - Springer

… We present oCLIP that learns better scene text visual representations by feature alignment
with textual information. As shown in Fig. 2, the proposed network first extracts image …

Cited by: 31 · Related articles · All 7 versions

[PDF] thecvf.com

Clipter: Looking at the bigger picture in scene text recognition

A Aberdam, D Bensaïd, A Golts… - … Computer Vision, 2023 - openaccess.thecvf.com

… In particular, we explore a range of vision and vision-language image encoders, pooling
operators, light-to-heavy fusion schemes, and different integration points between word-level …

Cited by: 11 · Related articles · All 8 versions

[PDF] thecvf.com

From two to one: A new scene text recognizer with visual language modeling network

Y Wang, H Xie, S Fang, J Wang… - … on Computer Vision, 2021 - openaccess.thecvf.com

… vision model with language capability. Specially, we introduce the text recognition of character…
Such operation guides the vision model to use not only the visual texture of characters, but …

Cited by: 143 · Related articles · All 6 versions

[PDF] thecvf.com

Vision-language pre-training for boosting scene text detectors

S Song, J Wan, Z Yang, J Tang… - … Recognition, 2022 - openaccess.thecvf.com

… Recently, vision-language joint representation learning has … adapt vision-language joint
learning for scene text detection, a task … ities: vision and language, since text is the written form of …

Cited by: 20 · Related articles · All 5 versions

[PDF] hal.science

Scene text recognition using higher order language priors

A Mishra, K Alahari, CV Jawahar - BMVC-British machine vision …, 2012 - inria.hal.science

… like character detection and recognition we provide annotated character bounding boxes. …
We address a more general problem of scene text recognition, i.e., recognizing a word without …

Cited by: 689 · Related articles · All 24 versions

[PDF] arxiv.org

Vision transformer for fast and efficient scene text recognition

R Atienza - … conference on document analysis and recognition, 2021 - Springer

… Scene text recognition (STR) enables computers to read text in natural scenes such as object
labels, road signs and instructions. STR helps machines perform informed decisions such …

Cited by: 141 · Related articles · All 5 versions

[PDF] arxiv.org

Visual attention models for scene text recognition

SK Ghosh, E Valveny… - … analysis and recognition …, 2017 - ieeexplore.ieee.org

… language modeling outperforms the state-of-the-art in unconstrained scene text recognition
… In this paper we proposed an LSTM-based visual attention model for scene text recognition. …

Cited by: 58 · Related articles · All 9 versions

[PDF] researchgate.net

Attention and language ensemble for scene text recognition with convolutional sequence modeling

S Fang, H Xie, ZJ Zha, N Sun, J Tan… - Proceedings of the 26th …, 2018 - dl.acm.org

… loss from language aspect, multiple losses from attention and language are accumulated
for … on standard datasets for scene text recognition, including Street View Text, IIIT5K and …

Cited by: 69 · Related articles · All 2 versions

[PDF] arxiv.org

Multi-granularity prediction for scene text recognition

P Wang, C Da, C Yao - European Conference on Computer Vision, 2022 - Springer

… language information of text. In order to effectively resort to linguistic information for scene text
recognition… in NLP [7] into text recognition method. Subword tokenization algorithms aim to …

Cited by: 48 · Related articles · All 5 versions

PMMN: pre-trained multi-modal network for scene text recognition

Y Zhang, Z Fu, F Huang, Y Liu - Pattern Recognition Letters, 2021 - Elsevier

… model and language model respectively to learn modality-specific knowledge for … scene
text recognition. In detail, we first pre-train the proposed off-the-shelf vision model and language …

Cited by: 11 · Related articles · All 3 versions
