Scaling & shifting your features: A new baseline for efficient model tuning

D Lian, D Zhou, J Feng, X Wang - Advances in Neural …, 2022 - proceedings.neurips.cc
Existing fine-tuning methods either tune all parameters of the pre-trained model (full fine-tuning), which is not efficient, or only tune the last linear layer (linear probing), which suffers …

Increlora: Incremental parameter allocation method for parameter-efficient fine-tuning

F Zhang, L Li, J Chen, Z Jiang, B Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
With the increasing size of pre-trained language models (PLMs), fine-tuning all the parameters in the model is not efficient, especially when there are a large number of …

Improved fine-tuning by better leveraging pre-training data

Z Liu, Y Xu, Y Xu, Q Qian, H Li, X Ji… - Advances in Neural …, 2022 - proceedings.neurips.cc
As a dominant paradigm, fine-tuning a pre-trained model on the target data is widely used in many deep learning applications, especially for small data sets. However, recent studies …

Rethinking the hyperparameters for fine-tuning

H Li, P Chaudhari, H Yang, M Lam… - arXiv preprint arXiv …, 2020 - arxiv.org
Fine-tuning from pre-trained ImageNet models has become the de-facto standard for various computer vision tasks. Current practices for fine-tuning typically involve selecting an ad-hoc …

DoRA: Weight-decomposed low-rank adaptation

SY Liu, CY Wang, H Yin, P Molchanov… - arXiv preprint arXiv …, 2024 - arxiv.org
Among the widely used parameter-efficient fine-tuning (PEFT) methods, LoRA and its variants have gained considerable popularity because they avoid additional inference …
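The low-rank update that LoRA and its variants (including DoRA) build on can be sketched in a few lines of NumPy. The hidden size, rank, and scaling factor below are assumed values for illustration; the variable names are not from either paper. The sketch also shows why these methods add no inference overhead: the trained factors can be merged back into the frozen weight.

```python
import numpy as np

rng = np.random.default_rng(1)
d, r = 512, 8                    # hidden size and low rank (assumed values)
alpha = 16.0                     # LoRA-style scaling factor (assumed)
W = rng.standard_normal((d, d))  # frozen pre-trained weight

# Trainable low-rank factors: the weight update is the rank-r product B @ A.
A = rng.standard_normal((r, d)) * 0.01
B = np.zeros((d, r))             # zero init: training starts exactly at W

def forward(x):
    # Frozen path plus scaled low-rank correction.
    return W @ x + (alpha / r) * (B @ (A @ x))

# At deployment the factors merge into W, so inference needs no extra
# matmul -- the "no additional inference overhead" the snippet refers to.
W_merged = W + (alpha / r) * (B @ A)

lora_params = A.size + B.size    # 2*r*d trainable vs d*d for full tuning
print(lora_params, W.size)       # 8192 vs 262144
```

DoRA itself goes a step further by decomposing the weight into a magnitude and a direction component and applying the low-rank update to the direction, but the parameter-count arithmetic above is the shared starting point.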

Stochastic normalization

Z Kou, K You, M Long, J Wang - Advances in Neural …, 2020 - proceedings.neurips.cc
Fine-tuning pre-trained deep networks on a small dataset is an important component in the deep learning pipeline. A critical problem in fine-tuning is how to avoid over-fitting when …

Prior gradient mask guided pruning-aware fine-tuning

L Cai, Z An, C Yang, Y Yan, Y Xu - … of the AAAI Conference on Artificial …, 2022 - ojs.aaai.org
We propose a Prior Gradient Mask Guided Pruning-aware Fine-Tuning (PGMPF) framework to accelerate deep convolutional neural networks (CNNs). In detail, the …

Sensitivity-aware visual parameter-efficient fine-tuning

H He, J Cai, J Zhang, D Tao… - Proceedings of the …, 2023 - openaccess.thecvf.com
Visual Parameter-Efficient Fine-Tuning (PEFT) has become a powerful alternative to full fine-tuning for adapting pre-trained vision models to downstream tasks; it only …

AdaMix: Mixture-of-adaptations for parameter-efficient model tuning

Y Wang, S Agarwal, S Mukherjee, X Liu, J Gao… - arXiv preprint arXiv …, 2022 - arxiv.org
Standard fine-tuning of large pre-trained language models (PLMs) for downstream tasks requires updating hundreds of millions to billions of parameters, and storing a large copy of …

On the effectiveness of parameter-efficient fine-tuning

Z Fu, H Yang, AMC So, W Lam, L Bing… - Proceedings of the AAAI …, 2023 - ojs.aaai.org
Fine-tuning pre-trained models has been ubiquitously proven to be effective in a wide range of NLP tasks. However, fine-tuning the whole model is parameter-inefficient as it always …