In many languages, much of the existing text data is locked away in non-digitized books and documents. This is particularly true in the case of most endangered languages, where …
S Rijhwani, D Rosenblum, M King… - arXiv preprint arXiv …, 2023 - arxiv.org
There has been recent interest in improving optical character recognition (OCR) for endangered languages, particularly because a large number of documents and books in …