Turning a CLIP model into a scene text detector

W Yu, Y Liu, W Hua, D Jiang… - Proceedings of the …, 2023 - openaccess.thecvf.com

The recent large-scale Contrastive Language-Image Pretraining (CLIP) model has shown
great potential in various downstream tasks via leveraging the pretrained vision and …

Cited by 56 - Related articles - All 7 versions

Towards end-to-end unified scene text detection and layout analysis

S Long, S Qin, D Panteleev… - Proceedings of the …, 2022 - openaccess.thecvf.com

Scene text detection and document layout analysis have long been treated as two separate
tasks in different image domains. In this paper, we bring them together and introduce the …

Cited by 81 - Related articles - All 8 versions

ESTextSpotter: Towards better scene text spotting with explicit synergy in transformer

M Huang, J Zhang, D Peng, H Lu… - Proceedings of the …, 2023 - openaccess.thecvf.com

In recent years, end-to-end scene text spotting approaches are evolving to the
Transformer-based framework. While previous studies have shown the crucial importance of the intrinsic …

Cited by 22 - Related articles - All 5 versions

Few could be better than all: Feature sampling and grouping for scene text detection

J Tang, W Zhang, H Liu, MK Yang… - Proceedings of the …, 2022 - openaccess.thecvf.com

Recently, transformer-based methods have achieved promising progresses in object
detection, as they can eliminate the post-processes like NMS and enrich the deep …

Cited by 80 - Related articles - All 7 versions

Arbitrary shape text detection via boundary transformer

SX Zhang, C Yang, X Zhu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

In arbitrary shape text detection, locating accurate text boundaries is challenging and
non-trivial. Existing methods often suffer from indirect text boundary modeling or complex post …

Cited by 39 - Related articles - All 4 versions

Language matters: A weakly supervised vision-language pre-training approach for scene text detection and spotting

C Xue, W Zhang, Y Hao, S Lu, PHS Torr… - European Conference on …, 2022 - Springer

Recently, Vision-Language Pre-training (VLP) techniques have greatly benefited
various vision-language tasks by jointly learning visual and textual representations, which …

Cited by 33 - Related articles - All 7 versions

TextDiffuser-2: Unleashing the power of language models for text rendering

J Chen, Y Huang, T Lv, L Cui, Q Chen, F Wei - arXiv preprint arXiv …, 2023 - arxiv.org

The diffusion model has been proven a powerful generative model in recent years, yet
remains a challenge in generating visual text. Several methods alleviated this issue by …

Cited by 23 - Related articles - All 2 versions

Towards robust real-time scene text detection: From semantic to instance representation learning

X Qin, P Lyu, C Zhang, Y Zhou, K Yao… - Proceedings of the 31st …, 2023 - dl.acm.org

Due to the flexible representation of arbitrary-shaped scene text and simple pipeline,
bottom-up segmentation-based methods begin to be mainstream in real-time scene text detection …

Cited by 7 - Related articles - All 4 versions

Vision-language pre-training for boosting scene text detectors

S Song, J Wan, Z Yang, J Tang… - Proceedings of the …, 2022 - openaccess.thecvf.com

Recently, vision-language joint representation learning has proven to be highly effective in
various scenarios. In this paper, we specifically adapt vision-language joint learning for …

Cited by 26 - Related articles - All 5 versions

A survey of text detection and recognition algorithms based on deep learning technology

XF Wang, ZH He, K Wang, YF Wang, L Zou, ZZ Wu - Neurocomputing, 2023 - Elsevier

Optical Character Recognition (OCR) poses a crucial challenge within the realm of
computer vision research, as it plays a pivotal role in converting vast amounts of …

Cited by 14 - Related articles - All 2 versions
