scene text recognition vision language - Academic Resource Search

Textscanner: Reading characters in order for robust scene text recognition

Z Wan, M He, H Chen, X Bai, C Yao - … of the AAAI conference on artificial …, 2020 - aaai.org

… We conduct experiments on public benchmarks for scene text recognition to validate the …
Scene text recognition using higher order language priors. In BMVC-British Machine Vision …

Cited by 150 · Related articles · All 8 versions

EMU: Effective multi-hot encoding net for lightweight scene text recognition with a large character set

B Li, X Tang, X Qi, Y Chen, CG Li… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

… modUle (EMU) for scene text recognition in the scenario of multi-languages or languages
with large character set. Specifically, EMU … , object segmentation and vision language model. …

Cited by 15 · Related articles · All 2 versions

[PDF] arxiv.org

I2C2W: image-to-character-to-word transformers for accurate scene text recognition

C Xue, J Huang, W Zhang, S Lu, C Wang… - arXiv preprint arXiv …, 2021 - arxiv.org

… of natural language processing, most recent scene text recognizers adopt an … of visual
features at noisy decoding time steps. This paper presents I2C2W, a novel scene text recognition …

Cited by 12 · Related articles · All 2 versions

[PDF] thecvf.com

Latr: Layout-aware transformer for scene-text vqa

AF Biten, R Litman, Y Xie… - … pattern recognition, 2022 - openaccess.thecvf.com

… and iterative language modeling for scene text recognition. In Proceedings of the IEEE/CVF
Conference on Computer Vision and Pattern Recognition, pages 7098–7107, 2021. 1, 6 …

Cited by 86 · Related articles · All 7 versions

[PDF] thecvf.com

An Empirical Study of Scaling Law for Scene Text Recognition

M Rang, Z Bi, C Liu, Y Wang… - … and Pattern Recognition, 2024 - openaccess.thecvf.com

… narrows its focus to the text recognition phase, specifically to Scene Text Recognition (STR).
STR … Scene text recognition using higher order language priors. BMVC-British machine …

[PDF] arxiv.org

E2e-mlt-an unconstrained end-to-end method for multi-language scene text

M Bušta, Y Patel, J Matas - … : 14th Asian Conference on Computer Vision …, 2019 - Springer

… Scene text recognition finds its use as a component in larger … driving, indoor navigations
and visual search engines. … training multi-language scene text detection, recognition and script …

Cited by 94 · Related articles · All 5 versions

[PDF] arxiv.org

VIPTR: A Vision Permutable Extractor for Fast and Efficient Scene Text Recognition

X Cheng, W Zhou, X Li, X Chen, J Yang, T Li… - arXiv preprint arXiv …, 2024 - arxiv.org

… that the single-vision model based on the self-attention mechanism can still achieve
comparable accuracy to the high-level vision-language model in the scene text recognition task. At …

Cited by 1 · Related articles · All 2 versions

Convolutional attention networks for scene text recognition

H Xie, S Fang, ZJ Zha, Y Yang, Y Li… - ACM Transactions on …, 2019 - dl.acm.org

… on standard datasets for scene text recognition, including Street View Text, IIIT5K, and ICDAR
… this article, we show that convolutional-based language modeling for text recognition not …

Cited by 82 · Related articles

[PDF] arxiv.org

Master: Multi-aspect non-local network for scene text recognition

N Lu, W Yu, X Qi, Y Chen, P Gong, R Xiao, X Bai - Pattern Recognition, 2021 - Elsevier

… [13] propose to ensemble attention and language models in an attention-based architecture.
… Images are generated from side-view angle snapshots in Google Street View. Therefore, …

Cited by 149 · Related articles · All 5 versions

[PDF] aaai.org

Locate then generate: Bridging vision and language with bounding box for scene-text vqa

Y Zhu, Z Liu, Y Liang, X Li, H Liu, C Bao… - Proceedings of the AAAI …, 2023 - ojs.aaai.org

… (b) Scene text recognition mistakes in the STVQA task. … out the correct scene text words, we
design a language refinement network based on a pre-trained language model to distinguish …

Cited by 4 · Related articles · All 6 versions
