Gradient vaccine: Investigating and improving multi-task optimization in massively multilingual models

Z Wang, Y Tsvetkov, O Firat, Y Cao - arXiv preprint arXiv:2010.05874, 2020 - arxiv.org
Massively multilingual models subsuming tens or even hundreds of languages pose great
challenges to multi-task optimization. While it is a common practice to apply a language …

On negative interference in multilingual models: Findings and a meta-learning treatment

Z Wang, ZC Lipton, Y Tsvetkov - arXiv preprint arXiv:2010.03017, 2020 - arxiv.org
Modern multilingual models are trained on concatenated text from multiple languages in
hopes of conferring benefits to each (positive transfer), with the most pronounced benefits …

MulDA: A multilingual data augmentation framework for low-resource cross-lingual NER

L Liu, B Ding, L Bing, S Joty, L Si… - Proceedings of the 59th …, 2021 - aclanthology.org
Named Entity Recognition (NER) for low-resource languages is both a practical and
challenging research problem. This paper addresses zero-shot transfer for cross-lingual …

A primer on pretrained multilingual language models

S Doddapaneni, G Ramesh, MM Khapra… - arXiv preprint arXiv …, 2021 - arxiv.org
Multilingual Language Models (MLLMs) such as mBERT, XLM, XLM-R, etc. have
emerged as a viable option for bringing the power of pretraining to a large number of …

Explicit alignment objectives for multilingual bidirectional encoders

J Hu, M Johnson, O Firat, A Siddhant… - arXiv preprint arXiv …, 2020 - arxiv.org
Pre-trained cross-lingual encoders such as mBERT (Devlin et al., 2019) and XLM-R
(Conneau et al., 2020) have proven to be impressively effective at enabling transfer-learning …

Cross-modal generalization: Learning in low resource modalities via meta-alignment

PP Liang, P Wu, L Ziyin, LP Morency… - Proceedings of the 29th …, 2021 - dl.acm.org
How can we generalize to a new prediction task at test time when it also uses a new
modality as input? More importantly, how can we do this with as little annotated data as …

Cross-lingual alignment methods for multilingual BERT: A comparative study

S Kulshreshtha, JL Redondo-García… - arXiv preprint arXiv …, 2020 - arxiv.org
Multilingual BERT (mBERT) has shown reasonable capability for zero-shot cross-lingual
transfer when fine-tuned on downstream tasks. Since mBERT is not pre-trained with explicit …

Investigating Unsupervised Neural Machine Translation for Low-resource Language Pair English-Mizo via Lexically Enhanced Pre-trained Language Models

C Lalrempuii, B Soni - ACM Transactions on Asian and Low-Resource …, 2023 - dl.acm.org
The vast majority of languages in the world at present are considered to be low-resource
languages. Since the availability of large parallel data is crucial for the success of most …

Model and data transfer for cross-lingual sequence labelling in zero-resource settings

I García-Ferrero, R Agerri, G Rigau - arXiv preprint arXiv:2210.12623, 2022 - arxiv.org
Zero-resource cross-lingual transfer approaches aim to apply supervised models from a
source language to unlabelled target languages. In this paper we perform an in-depth study …

Aligning cross-lingual sentence representations with dual momentum contrast

L Wang, W Zhao, J Liu - arXiv preprint arXiv:2109.00253, 2021 - arxiv.org
In this paper, we propose to align sentence representations from different languages into a
unified embedding space, where semantic similarities (both cross-lingual and monolingual) …