Natural Language Processing RELIES on Linguistics

J Opitz, S Wein, N Schneider - arXiv preprint arXiv:2405.05966, 2024 - arxiv.org
Large Language Models (LLMs) have become capable of generating highly fluent text in
certain languages, without modules specially designed to capture grammar or semantic …

Can we teach language models to gloss endangered languages?

M Ginn, M Hulden, A Palmer - arXiv preprint arXiv:2406.18895, 2024 - arxiv.org
Interlinear glossed text (IGT) is a popular format in language documentation projects, where
each morpheme is labeled with a descriptive annotation. Automating the creation of …

Shortcomings of LLMs for Low-Resource Translation: Retrieval and Understanding are Both the Problem

S Court, M Elsner - arXiv preprint arXiv:2406.15625, 2024 - arxiv.org
This work investigates the in-context learning abilities of pretrained large language models
(LLMs) when instructed to translate text from a low-resource language into a high-resource …

NLP for Language Documentation: Two Reasons for the Gap between Theory and Practice

L Gessler, K von der Wense - … of the 4th Workshop on Natural …, 2024 - aclanthology.org
Both NLP researchers and linguists have expressed a desire to use language technologies
in language documentation, but most documentary work still proceeds without them …

Computational Language Documentation: Designing a Modular Annotation and Data Management Tool for Cross-cultural Applicability

A O'Neil, D Swanson, S Chelliah - … of the 2nd Workshop on Cross …, 2024 - aclanthology.org
While developing computational language documentation tools, researchers must center the
role of language communities in the process by carefully reflecting on and designing tools to …