A concise survey of OCR for low-resource languages

M Agarwal, A Anastasopoulos - … of the 4th Workshop on Natural …, 2024 - aclanthology.org
Modern natural language processing (NLP) techniques increasingly require substantial
amounts of data to train robust algorithms. Building such technologies for low-resource …

Improving Optical Character Recognition for Endangered Languages

S Rijhwani - 2023 - kilthub.cmu.edu
Much of the text data that exists in many languages is locked away in nondigitized books
and documents. This is particularly true in the case of most endangered languages, where …

User-Centric Evaluation of OCR Systems for Kwak'wala

S Rijhwani, D Rosenblum, M King… - arXiv preprint arXiv …, 2023 - arxiv.org
There has been recent interest in improving optical character recognition (OCR) for
endangered languages, particularly because a large number of documents and books in …