Few-shot parameter-efficient fine-tuning is better and cheaper than in-context learning

H Liu, D Tam, M Muqeeth, J Mohta… - Advances in …, 2022 - proceedings.neurips.cc
Few-shot in-context learning (ICL) enables pre-trained language models to perform a
previously-unseen task without any gradient-based training by feeding a small number of …

Efficient methods for natural language processing: A survey

M Treviso, JU Lee, T Ji, B van Aken, Q Cao… - Transactions of the …, 2023 - direct.mit.edu
Recent work in natural language processing (NLP) has yielded appealing results from
scaling model parameters and training data; however, using only scale to improve …

AutoPEFT: Automatic Configuration Search for Parameter-Efficient Fine-Tuning

H Zhou, X Wan, I Vulić, A Korhonen - Transactions of the Association …, 2024 - direct.mit.edu
Large pretrained language models are widely used in downstream NLP tasks via task-
specific fine-tuning, but such procedures can be costly. Recently, Parameter-Efficient Fine …

IAPT: Instance-Aware Prompt Tuning for Large Language Models

W Zhu, A Tian, C Yin, Y Ni, X Wang… - Proceedings of the 62nd …, 2024 - aclanthology.org
Soft prompt tuning is a widely studied parameter-efficient fine-tuning method. However, it
has a clear drawback: many soft tokens must be inserted into the input sequences to …
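
A minimal, generic sketch of soft prompt tuning (assuming PyTorch; this illustrates the basic technique the snippet refers to, not the instance-aware prompts proposed in IAPT): a block of learnable "soft token" embeddings is prepended to the input embeddings, and only those embeddings receive gradients.

import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    """Prepends trainable soft-token embeddings to the input embeddings."""
    def __init__(self, num_soft_tokens: int, hidden_size: int):
        super().__init__()
        # Only this parameter is trained; the backbone model stays frozen.
        self.prompt = nn.Parameter(torch.randn(num_soft_tokens, hidden_size) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, hidden_size)
        batch = input_embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompt, input_embeds], dim=1)

The drawback noted in the abstract follows directly: every inserted soft token lengthens the sequence the backbone must process.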

Learned adapters are better than manually designed adapters

Y Zhang, P Wang, M Tan, W Zhu - Findings of the Association for …, 2023 - aclanthology.org
Recently, a series of works have looked into further improving adapter-based tuning by
manually designing better adapter architectures. Understandably, these manually designed …

MerA: Merging pretrained adapters for few-shot learning

S He, RZ Fan, L Ding, L Shen, T Zhou… - arXiv preprint arXiv …, 2023 - arxiv.org
Adapter tuning, which updates only a few parameters, has become a mainstream method for
fine-tuning pretrained language models to downstream tasks. However, it often yields …
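
As background for the adapter entries above, a minimal bottleneck-adapter sketch (assuming PyTorch; a generic illustration of adapter tuning, not MerA's merging procedure): a small down-projection/up-projection pair with a residual connection is inserted after a frozen transformer sub-layer, and only the adapter parameters are updated.

import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Small residual module inserted after a frozen transformer sub-layer."""
    def __init__(self, hidden_size: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)  # trained
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, hidden_size)    # trained

    def forward(self, hidden_states):
        # The residual connection preserves the pretrained representation.
        return hidden_states + self.up(self.act(self.down(hidden_states)))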

Sparse structure search for delta tuning

S Hu, Z Zhang, N Ding, Y Wang… - Advances in Neural …, 2022 - proceedings.neurips.cc
Adapting large pre-trained models (PTMs) through fine-tuning imposes prohibitive
computational and storage burdens. Recent studies of delta tuning (DT), i.e., parameter …

ALoRA: Allocating low-rank adaptation for fine-tuning large language models

Z Liu, J Lyn, W Zhu, X Tian, Y Graham - arXiv preprint arXiv:2403.16187, 2024 - arxiv.org
Parameter-efficient fine-tuning (PEFT) is widely studied for its effectiveness and efficiency in
the era of large language models. Low-rank adaptation (LoRA) has demonstrated …
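
For reference, a minimal LoRA-style layer (assuming PyTorch; a generic sketch of low-rank adaptation, not the rank-allocation scheme of ALoRA): the frozen pretrained weight is augmented with a trainable low-rank update, scaled by alpha/rank.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer with a trainable low-rank update."""
    def __init__(self, in_features: int, out_features: int, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        # Pretrained weight W stays frozen.
        self.weight = nn.Parameter(torch.randn(out_features, in_features), requires_grad=False)
        # Low-rank factors A and B are the only trained parameters.
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))  # zero init: no change at start
        self.scaling = alpha / rank

    def forward(self, x):
        # y = x W^T + (alpha / rank) * x A^T B^T
        return x @ self.weight.T + self.scaling * ((x @ self.lora_A.T) @ self.lora_B.T)

Because the update has rank r, the trained parameters per layer number r * (in_features + out_features) rather than in_features * out_features.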

Phylogeny-inspired adaptation of multilingual models to new languages

F Faisal, A Anastasopoulos - arXiv preprint arXiv:2205.09634, 2022 - arxiv.org
Large pretrained multilingual models, trained on dozens of languages, have delivered
promising results due to cross-lingual learning capabilities on a variety of language tasks …

A Survey on Transformers in NLP with Focus on Efficiency

W Ansar, S Goswami, A Chakrabarti - arXiv preprint arXiv:2406.16893, 2024 - arxiv.org
The advent of transformers with attention mechanisms and associated pre-trained models
has revolutionized the field of Natural Language Processing (NLP). However, such models …