Pixology: Probing the Linguistic and Visual Capabilities of Pixel-based Language Models

K Tatariya, V Araujo, T Bauwens… - arXiv preprint arXiv …, 2024 - arxiv.org
Pixel-based language models have emerged as a compelling alternative to subword-based
language modelling, particularly because they can represent virtually any script. PIXEL, a …

A Concise Survey of OCR for Low-Resource Languages

M Agarwal, A Anastasopoulos - … of the 4th Workshop on Natural …, 2024 - aclanthology.org
Modern natural language processing (NLP) techniques increasingly require substantial
amounts of data to train robust algorithms. Building such technologies for low-resource …

Digitalisation Workflows in the Age of Transformer Models: A Case Study in Digital Cultural Heritage

M Vafaie, MA Tan, H Sack - 2024 - semdh.github.io
The advent of transformer architecture revolutionised the field of Artificial Intelligence (AI)
and its various applications. It is only recently that digitalisation of cultural heritage data has …