LaTr: Layout-aware transformer for scene-text VQA

AF Biten, R Litman, Y Xie… - Proceedings of the …, 2022 - openaccess.thecvf.com
We propose a novel multimodal architecture for Scene Text Visual Question Answering
(STVQA), named Layout-Aware Transformer (LaTr). The task of STVQA requires models to …

Revisiting scene text recognition: A data perspective

Q Jiang, J Wang, D Peng, C Liu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
This paper aims to re-assess scene text recognition (STR) from a data-oriented perspective.
We begin by revisiting the six commonly used benchmarks in STR and observe a trend of …

Knowledge graph contrastive learning based on relation-symmetrical structure

K Liang, Y Liu, S Zhou, W Tu, Y Wen… - … on Knowledge and …, 2023 - ieeexplore.ieee.org
Knowledge graph embedding (KGE) aims at learning powerful representations to benefit
various artificial intelligence applications. Meanwhile, contrastive learning has been widely …

Reading and writing: Discriminative and generative modeling for self-supervised text recognition

M Yang, M Liao, P Lu, J Wang, S Zhu, H Luo… - Proceedings of the 30th …, 2022 - dl.acm.org
Existing text recognition methods usually need large-scale training data. Most of them rely
on synthetic training data due to the lack of annotated real images. However, there is a …

Dual temperature helps contrastive learning without many negative samples: Towards understanding and simplifying MoCo

C Zhang, K Zhang, TX Pham, A Niu… - Proceedings of the …, 2022 - openaccess.thecvf.com
Contrastive learning (CL) is widely known to require many negative samples, 65536 in
MoCo for instance, for which the performance of a dictionary-free framework is often inferior …

Context-based contrastive learning for scene text recognition

X Zhang, B Zhu, X Yao, Q Sun, R Li, B Yu - Proceedings of the AAAI …, 2022 - ojs.aaai.org
Pursuing accurate and robust recognizers has been a long-lasting goal for scene text
recognition (STR) researchers. Recently, attention-based methods have demonstrated their …

Neighbor contrastive learning on learnable graph augmentation

X Shen, D Sun, S Pan, X Zhou, LT Yang - Proceedings of the AAAI …, 2023 - ojs.aaai.org
Recent years, graph contrastive learning (GCL), which aims to learn representations from
unlabeled graphs, has made great progress. However, the existing GCL methods mostly …

Perceiving stroke-semantic context: Hierarchical contrastive learning for robust scene text recognition

H Liu, B Wang, Z Bao, M Xue, S Kang, D Jiang… - Proceedings of the …, 2022 - ojs.aaai.org
We introduce Perceiving Stroke-Semantic Context (PerSec), a new approach to
self-supervised representation learning tailored for Scene Text Recognition (STR) task …

MaskOCR: Text recognition with masked encoder-decoder pretraining

P Lyu, C Zhang, S Liu, M Qiao, Y Xu, L Wu… - arXiv preprint arXiv …, 2022 - arxiv.org
Text images contain both visual and linguistic information. However, existing pre-training
techniques for text recognition mainly focus on either visual representation learning or …

Multi-relational contrastive learning for recommendation

W Wei, L Xia, C Huang - Proceedings of the 17th ACM Conference on …, 2023 - dl.acm.org
Personalized recommender systems play a crucial role in capturing users' evolving
preferences over time to provide accurate and effective recommendations on various online …