Neural machine translation for low-resource languages: A survey

S Ranathunga, ESA Lee, M Prifti Skenduli… - ACM Computing …, 2023 - dl.acm.org
Neural Machine Translation (NMT) has seen tremendous growth in the last ten years and has
already entered a mature phase. While considered the most widely …

Meta learning for natural language processing: A survey

H Lee, SW Li, NT Vu - arXiv preprint arXiv:2205.01500, 2022 - arxiv.org
Deep learning has been the mainstream technique in the natural language processing (NLP)
area. However, these techniques require large amounts of labeled data and are less generalizable across …

Beyond English-centric multilingual machine translation

A Fan, S Bhosale, H Schwenk, Z Ma, A El-Kishky… - Journal of Machine …, 2021 - jmlr.org
Existing work in translation demonstrated the potential of massively multilingual machine
translation by training a single model able to translate between any pair of languages …

A pretrainer's guide to training data: Measuring the effects of data age, domain coverage, quality, & toxicity

S Longpre, G Yauney, E Reif, K Lee, A Roberts… - arXiv preprint arXiv …, 2023 - arxiv.org
Pretraining is the preliminary and fundamental step in developing capable language models
(LMs). Despite this, pretraining data design is critically under-documented and often guided …

Participatory research for low-resourced machine translation: A case study in African languages

W Nekoto, V Marivate, T Matsila, T Fasubaa… - arXiv preprint arXiv …, 2020 - arxiv.org
Research in NLP lacks geographic diversity, and the question of how NLP can be scaled to
low-resourced languages has not yet been adequately solved. "Low-resourced"-ness is a …

Uni-Perceiver-MoE: Learning sparse generalist models with conditional MoEs

J Zhu, X Zhu, W Wang, X Wang, H Li… - Advances in Neural …, 2022 - proceedings.neurips.cc
To build artificial neural networks that resemble biological intelligence systems, recent works have
unified numerous tasks into a generalist model, which can process various tasks with shared …

On negative interference in multilingual models: Findings and a meta-learning treatment

Z Wang, ZC Lipton, Y Tsvetkov - arXiv preprint arXiv:2010.03017, 2020 - arxiv.org
Modern multilingual models are trained on concatenated text from multiple languages in
hopes of conferring benefits to each (positive transfer), with the most pronounced benefits …

Aya model: An instruction finetuned open-access multilingual language model

A Üstün, V Aryabumi, ZX Yong, WY Ko… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent breakthroughs in large language models (LLMs) have centered around a handful of
data-rich languages. What does it take to broaden access to breakthroughs beyond first …

Scaling end-to-end models for large-scale multilingual ASR

B Li, R Pang, TN Sainath, A Gulati… - 2021 IEEE Automatic …, 2021 - ieeexplore.ieee.org
Building ASR models across many languages is a challenging multi-task learning problem
due to large variations and heavily unbalanced data. Existing work has shown positive …

Scaling laws for multilingual neural machine translation

P Fernandes, B Ghorbani, X Garcia… - International …, 2023 - proceedings.mlr.press
In this work, we provide a large-scale empirical study of the scaling properties of multilingual
neural machine translation models. We examine how increases in the model size affect the …