A Practical Guide to Fine-tuning Language Models with Limited Data

M Szép, D Rueckert, R von Eisenhart-Rothe… - arXiv preprint arXiv …, 2024 - arxiv.org
Employing pre-trained Large Language Models (LLMs) has become the de facto standard in
Natural Language Processing (NLP) despite their extensive data requirements. Motivated by …

Layer swapping for zero-shot cross-lingual transfer in large language models

L Bandarkar, B Muller, P Yuvraj, R Hou… - arXiv preprint arXiv …, 2024 - arxiv.org
Model merging, such as model souping, is the practice of combining different models with
the same architecture without further training. In this work, we present a model …
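As a rough illustration of the model-souping idea mentioned in this snippet, the following is a minimal sketch of parameter-wise averaging of checkpoints that share an architecture, with no further training; the checkpoint file names and the equal weighting are assumptions for illustration, not details from the paper.

```python
# Minimal model-souping sketch: average the weights of checkpoints that
# share the same architecture, without any further training.
import torch

def soup(state_dicts, weights=None):
    """Return a parameter-wise weighted average of the given state dicts."""
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    merged = {}
    for key in state_dicts[0]:
        merged[key] = sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
    return merged

if __name__ == "__main__":
    # Hypothetical checkpoint paths, used only for illustration.
    sd_a = torch.load("checkpoint_a.pt")
    sd_b = torch.load("checkpoint_b.pt")
    torch.save(soup([sd_a, sd_b]), "souped.pt")
```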

How Transliterations Improve Crosslingual Alignment

Y Liu, M Wang, AH Kargaran, A Imani, O Xhelili… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent studies have shown that post-aligning multilingual pretrained language models
(mPLMs) using alignment objectives on both original and transliterated data can improve …
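To make the notion of an alignment objective on original and transliterated data concrete, here is a small sketch that pulls the pooled representations of a sentence and its transliteration together via a cosine-similarity loss; the model choice, mean pooling, and loss form are assumptions, not the exact objective used in the cited paper.

```python
# Illustrative post-alignment objective: encourage an mPLM to give similar
# sentence representations to an original string and its transliteration.
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModel.from_pretrained("xlm-roberta-base")

def mean_pool(text):
    batch = tok(text, return_tensors="pt")
    hidden = model(**batch).last_hidden_state  # (1, seq_len, dim)
    return hidden.mean(dim=1)                  # (1, dim)

orig = mean_pool("Москва")        # original script
translit = mean_pool("Moskva")    # Latin transliteration
loss = 1 - F.cosine_similarity(orig, translit).mean()  # alignment loss
loss.backward()                   # gradients for a post-alignment step
```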

AdaMergeX: Cross-Lingual Transfer with Large Language Models via Adaptive Adapter Merging

Y Zhao, W Zhang, H Wang, K Kawaguchi… - arXiv preprint arXiv …, 2024 - arxiv.org
As an effective alternative to the direct fine-tuning on target tasks in specific languages,
cross-lingual transfer addresses the challenges of limited training data by decoupling "task …
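The snippet above describes cross-lingual transfer via adapter merging; the sketch below shows a generic way to combine a task adapter trained in a source language with a language adapter for the target language. The simple weighted-sum merge rule and the file names are illustrative assumptions, not the adaptive merging formula proposed in AdaMergeX.

```python
# Generic adapter-merging sketch for cross-lingual transfer: combine a
# source-language task adapter with a target-language adapter.
import torch

def merge_adapters(task_adapter, lang_adapter, alpha=0.5):
    """Weighted sum of two adapters with matching parameter names."""
    return {k: alpha * task_adapter[k] + (1 - alpha) * lang_adapter[k]
            for k in task_adapter}

if __name__ == "__main__":
    # Hypothetical adapter checkpoints, named for illustration only.
    task_sd = torch.load("task_adapter_en.pt")
    lang_sd = torch.load("lang_adapter_de.pt")
    torch.save(merge_adapters(task_sd, lang_sd), "merged_adapter.pt")
```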

Predicting Machine Translation Performance on Low-Resource Languages: The Role of Domain Similarity

E Khiu, H Toossi, D Anugraha, J Liu, J Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Fine-tuning and testing a multilingual large language model is expensive and challenging
for low-resource languages (LRLs). While previous studies have predicted the performance …

Cross-Lingual Transfer with Large Language Models via Adaptive Adapter Merging

Y Zhao, W Zhang, H Wang, K Kawaguchi, L Bing - openreview.net
As an effective alternative to the direct fine-tuning on target tasks in specific languages,
cross-lingual transfer addresses the challenges of limited training data by aligning …