Participatory research for low-resourced machine translation: A case study in African languages

W Nekoto, V Marivate, T Matsila, T Fasubaa… - arXiv preprint arXiv …, 2020 - arxiv.org
Research in NLP lacks geographic diversity, and the question of how NLP can be scaled to
low-resourced languages has not yet been adequately solved. "Low-resourced"-ness is a …

Aya dataset: An open-access collection for multilingual instruction tuning

S Singh, F Vargus, D Dsouza, BF Karlsson… - arXiv preprint arXiv …, 2024 - arxiv.org
Datasets are foundational to many breakthroughs in modern artificial intelligence. Many
recent achievements in the space of natural language processing (NLP) can be attributed to …

Evolving technologies for language learning

R Godwin-Jones - 2021 - scholarspace.manoa.hawaii.edu
This column traces the evolution of electronic resources for language learning over the past
25 years, focusing on the arrival and transformation of the “world wide web”, the dramatic …

Feeling proud, feeling embarrassed: Experiences of low-income women with crowd work

RA Varanasi, D Siddarth, V Seshadri, K Bali… - Proceedings of the …, 2022 - dl.acm.org
Women's economic empowerment is central to gender equality. However, work
opportunities available to low-income women in patriarchal societies are infrequent. While …

Akal Badi ya Bias: An Exploratory Study of Gender Bias in Hindi Language Technology

R Hada, S Husain, V Gumma, H Diddee… - The 2024 ACM …, 2024 - dl.acm.org
Existing research in measuring and mitigating gender bias predominantly centers on
English, overlooking the intricate challenges posed by non-English languages and the …

Predicting the performance of multilingual NLP models

A Srinivasan, S Sitaram, T Ganu, S Dandapat… - arXiv preprint arXiv …, 2021 - arxiv.org
Recent advancements in NLP have given us models like mBERT and XLMR that can serve
over 100 languages. The languages that these models are evaluated on, however, are very …

Assessing digital language support on a global scale

GF Simons, AL Thomas, CK White - arXiv preprint arXiv:2209.13515, 2022 - arxiv.org
The users of endangered languages struggle to thrive in a digitally-mediated world. We
have developed an automated method for assessing how well every language recognized …

Situating Automatic Speech Recognition Development within Communities of Under-heard Language Speakers

T Reitmaier, E Wallington, O Klejch, N Markl… - Proceedings of the …, 2023 - dl.acm.org
In this paper we develop approaches to automatic speech recognition (ASR) development
that suit the needs and functions of under-heard language speakers. Our novel contribution …

Towards language sensitivity and diversity in the digital humanities

PJ Spence, R Brandao - Digital Studies/Le champ numérique, 2021 - digitalstudies.org
Recent years have seen a growing focus on diversity in the digital humanities, and yet there
has been rather less work on geolinguistic diversity, and the research which has been …

Intent identification and entity extraction for healthcare queries in Indic languages

A Mullick, I Mondal, S Ray, R Raghav… - arXiv preprint arXiv …, 2023 - arxiv.org
Scarcity of data and technological limitations for resource-poor languages in developing
countries like India poses a threat to the development of sophisticated NLU systems for …