scene text recognition vision language - Academic Resource Search

Prestu: Pre-training for scene-text understanding

J Kil, S Changpinyo, X Chen, H Hu… - … Computer Vision, 2023 - openaccess.thecvf.com

… Inspired by the impressive performance of the visual language modeling pre-training … ,
focus on learning both scene-text recognition and the role of scene-text in its visual context. The …

Cited by 21 · Related articles · All 9 versions

[PDF] thecvf.com

Scene text visual question answering

AF Biten, R Tito, A Mafla, L Gomez… - … on computer vision, 2019 - openaccess.thecvf.com

… poral Classification have also been widely used in scene text recognition, in works such as
[47, … Question Answering (VQA) aims to come up with an answer to a given natural language …

Cited by 289 · Related articles · All 13 versions

[PDF] cv-foundation.org

[PDF] Strokelets: A Learned Multi-Scale Representation for Scene Text Recognition

X Bai, C Yao, W Liu - … on computer vision and pattern recognition, 2014 - cv-foundation.org

… text recognition in natural scenes (aka scene text recognition) … characteristics of the
corresponding languages. For example, … could learn a hybrid set of strokelets on multiple …

Cited by 108 · Related articles · All 7 versions

[PDF] thecvf.com

OTE: Exploring Accurate Scene Text Recognition Using One Token

J Xu, Y Wang, H Xie, Y Zhang - … and Pattern Recognition, 2024 - openaccess.thecvf.com

… a sequence of visual tokens to represent scene text images, … the entire text image and
achieve accurate text recognition. … and iterative language modeling for scene text recognition. …

[PDF] aaai.org

Perceiving stroke-semantic context: Hierarchical contrastive learning for robust scene text recognition

H Liu, B Wang, Z Bao, M Xue, S Kang, D Jiang… - Proceedings of the …, 2022 - ojs.aaai.org

… for Scene Text Recognition (STR) task. Considering scene text images carry both visual and
… from translation by recognizing foreign languages to street sign recognition for autonomous. …

Cited by 35 · Related articles · All 4 versions

[PDF] thecvf.com

Estextspotter: Towards better scene text spotting with explicit synergy in transformer

M Huang, J Zhang, D Peng, H Lu… - … Computer Vision, 2023 - openaccess.thecvf.com

… a vision-language communication module designed to enhance explicit synergy, which
utilizes a novel collaborative cross-modal interaction between text detection and recognition. …

Cited by 14 · Related articles · All 5 versions

[PDF] thecvf.com

Towards accurate scene text recognition with semantic reasoning networks

D Yu, X Li, C Zhang, T Liu, J Han… - … pattern recognition, 2020 - openaccess.thecvf.com

… In addition, transformer has been proved to be effective in many tasks of computer vision [11,
36] and natural language processing [34]. In this paper, we not only adopt transformer to …

Cited by 341 · Related articles · All 6 versions

[PDF] thecvf.com

Knowledge mining with scene text for fine-grained recognition

H Wang, J Liao, T Cheng, Z Gao… - … Recognition, 2022 - openaccess.thecvf.com

… ] as a knowledge-aware language model and apply them to extract … vision-language tasks,
they require the annotation of image-… Scene text retrieval via joint text detection and similarity …

Cited by 12 · Related articles · All 7 versions

[PDF] mpg.de

[PDF] Visual-Semantic Transformer for Scene Text Recognition.

L Diao, X Tang, J Wang, R Fang, G Xie, W Chen - BMVC, 2022 - bmvc2022.mpi-inf.mpg.de

… demonstrate that VST can achieve higher or competitive prediction accuracy in scene
text recognition without the aid of explicit language models. 2 Visual-Semantic Transformer …

Cited by 2 · Related articles · All 2 versions

[PDF] thecvf.com

On vocabulary reliance in scene text recognition

Z Wan, J Zhang, L Zhang, J Luo… - … Pattern Recognition, 2020 - openaccess.thecvf.com

… of different algorithms in learning language prior. Meanwhile, … for developing scene text
recognition algorithms in the future. … Segmentation-based methods can accurately extract visual …

Cited by 64 · Related articles · All 7 versions
