Read like humans: Autonomous, bidirectional and iterative language modeling for scene text recognition

S Fang, H Xie, Y Wang, Z Mao… - Proceedings of the …, 2021 - openaccess.thecvf.com
Linguistic knowledge is of great benefit to scene text recognition. However, how to effectively
model linguistic rules in end-to-end deep networks remains a research challenge. In this …

From two to one: A new scene text recognizer with visual language modeling network

Y Wang, H Xie, S Fang, J Wang… - Proceedings of the …, 2021 - openaccess.thecvf.com
In this paper, we abandon the dominant complex language model and rethink the linguistic
learning process in the scene text recognition. Different from previous methods considering …

Towards accurate scene text recognition with semantic reasoning networks

D Yu, X Li, C Zhang, T Liu, J Han… - Proceedings of the …, 2020 - openaccess.thecvf.com
Scene text image contains two levels of contents: visual texture and semantic information.
Although the previous scene text recognition methods have made great progress over the …

PETR: Rethinking the capability of transformer-based language model in scene text recognition

Y Wang, H Xie, S Fang, M Xing, J Wang… - … on Image Processing, 2022 - ieeexplore.ieee.org
The exploration of linguistic information promotes the development of scene text recognition
task. Benefiting from the significance in parallel reasoning and global relationship capture …

Towards video text visual question answering: Benchmark and baseline

M Zhao, B Li, J Wang, W Li, W Zhou… - Advances in …, 2022 - proceedings.neurips.cc
There are already some text-based visual question answering (TextVQA) benchmarks for
developing machine's ability to answer questions based on texts in images in recent years …

Scene text recognition with sliding convolutional character models

F Yin, YC Wu, XY Zhang, CL Liu - arXiv preprint arXiv:1709.01727, 2017 - arxiv.org
Scene text recognition has attracted great interest from the computer vision and pattern
recognition community in recent years. State-of-the-art methods use convolutional neural …

RoadText-1K: Text detection & recognition dataset for driving videos

S Reddy, M Mathew, L Gomez… - … on Robotics and …, 2020 - ieeexplore.ieee.org
Perceiving text is crucial to understand semantics of outdoor scenes and hence is a critical
requirement to build intelligent systems for driver assistance and self-driving. Most of the …

A bilingual, openworld video text dataset and end-to-end video text spotter with transformer

W Wu, Y Cai, D Zhang, S Wang, Z Li, J Li… - arXiv preprint arXiv …, 2021 - arxiv.org
Most existing video text spotting benchmarks focus on evaluating a single language and
scenario with limited data. In this work, we introduce a large-scale, Bilingual, Open World …

A novel text structure feature extractor for Chinese scene text detection and recognition

X Ren, Y Zhou, Z Huang, J Sun, X Yang, K Chen - IEEE Access, 2017 - ieeexplore.ieee.org
Scene text information extraction plays an important role in many computer vision
applications. Most features in existing text extraction algorithms are only applicable to one …

DSText V2: A comprehensive video text spotting dataset for dense and small text

W Wu, Y Zhang, Y He, L Zhang, Z Lou, H Zhou, X Bai - Pattern Recognition, 2024 - Elsevier
Recently, video text detection, tracking, and recognition in natural scenes are becoming very
popular in the computer vision community. However, most existing algorithms and …