Neural machine translation for low-resource languages: A survey

S Ranathunga, ESA Lee, M Prifti Skenduli… - ACM Computing …, 2023 - dl.acm.org
Neural Machine Translation (NMT) has seen tremendous growth in the last ten years and has
already entered a mature phase. While considered the most widely …

Meta learning for natural language processing: A survey

H Lee, SW Li, NT Vu - arXiv preprint arXiv:2205.01500, 2022 - arxiv.org
Deep learning has been the mainstream technique in the natural language processing (NLP)
area. However, these techniques require large amounts of labeled data and are less generalizable across …

Beyond English-centric multilingual machine translation

A Fan, S Bhosale, H Schwenk, Z Ma, A El-Kishky… - Journal of Machine …, 2021 - jmlr.org
Existing work in translation demonstrated the potential of massively multilingual machine
translation by training a single model able to translate between any pair of languages …

A pretrainer's guide to training data: Measuring the effects of data age, domain coverage, quality, & toxicity

S Longpre, G Yauney, E Reif, K Lee, A Roberts… - arXiv preprint arXiv …, 2023 - arxiv.org
Pretraining is the preliminary and fundamental step in developing capable language models
(LMs). Despite this, pretraining data design is critically under-documented and often guided …

Participatory research for low-resourced machine translation: A case study in African languages

W Nekoto, V Marivate, T Matsila, T Fasubaa… - arXiv preprint arXiv …, 2020 - arxiv.org
Research in NLP lacks geographic diversity, and the question of how NLP can be scaled to
low-resourced languages has not yet been adequately solved. "Low-resourced"-ness is a …

Uni-Perceiver-MoE: Learning sparse generalist models with conditional MoEs

J Zhu, X Zhu, W Wang, X Wang, H Li… - Advances in Neural …, 2022 - proceedings.neurips.cc
To build artificial neural networks that resemble biological intelligence systems, recent works have
unified numerous tasks into a generalist model, which can process various tasks with shared …

On negative interference in multilingual models: Findings and a meta-learning treatment

Z Wang, ZC Lipton, Y Tsvetkov - arXiv preprint arXiv:2010.03017, 2020 - arxiv.org
Modern multilingual models are trained on concatenated text from multiple languages in
hopes of conferring benefits to each (positive transfer), with the most pronounced benefits …

Aya model: An instruction finetuned open-access multilingual language model

A Üstün, V Aryabumi, ZX Yong, WY Ko… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent breakthroughs in large language models (LLMs) have centered around a handful of
data-rich languages. What does it take to broaden access to breakthroughs beyond first …

Scaling end-to-end models for large-scale multilingual ASR

B Li, R Pang, TN Sainath, A Gulati… - 2021 IEEE Automatic …, 2021 - ieeexplore.ieee.org
Building ASR models across many languages is a challenging multi-task learning problem
due to large variations and heavily unbalanced data. Existing work has shown positive …

Scaling laws for multilingual neural machine translation

P Fernandes, B Ghorbani, X Garcia… - International …, 2023 - proceedings.mlr.press
In this work, we provide a large-scale empirical study of the scaling properties of multilingual
neural machine translation models. We examine how increases in the model size affect the …