Parameter-efficient fine-tuning for large models: A comprehensive survey

Z Han, C Gao, J Liu, SQ Zhang - arXiv preprint arXiv:2403.14608, 2024 - arxiv.org
Large models represent a groundbreaking advancement in multiple application fields,
enabling remarkable achievements across various tasks. However, their unprecedented …

Visual tuning

BXB Yu, J Chang, H Wang, L Liu, S Wang… - ACM Computing …, 2024 - dl.acm.org
Fine-tuning visual models has been widely shown to give promising performance on many
downstream visual tasks. With the surprising development of pre-trained visual foundation …

LLaMA-Adapter V2: Parameter-efficient visual instruction model

P Gao, J Han, R Zhang, Z Lin, S Geng, A Zhou… - arXiv preprint arXiv …, 2023 - arxiv.org
How to efficiently transform large language models (LLMs) into instruction followers is
recently a popular research direction, while training LLM for multi-modal reasoning remains …

Deep class-incremental learning: A survey

DW Zhou, QW Wang, ZH Qi, HJ Ye, DC Zhan… - arXiv preprint arXiv …, 2023 - arxiv.org
Deep models, e.g., CNNs and Vision Transformers, have achieved impressive achievements
in many vision tasks in the closed world. However, novel classes emerge from time to time in …

CLIP-Adapter: Better vision-language models with feature adapters

P Gao, S Geng, R Zhang, T Ma, R Fang… - International Journal of …, 2024 - Springer
Large-scale contrastive vision-language pretraining has shown significant progress in visual
representation learning. Unlike traditional visual systems trained by a fixed set of discrete …

VPGTrans: Transfer visual prompt generator across LLMs

A Zhang, H Fei, Y Yao, W Ji, L Li… - Advances in Neural …, 2024 - proceedings.neurips.cc
Since developing a new multimodal LLM (MLLM) by pre-training on tremendous image-text
pairs from scratch can be exceedingly resource-consuming, connecting an existing LLM with …

GraphAdapter: Tuning vision-language models with dual knowledge graph

X Li, D Lian, Z Lu, J Bai, Z Chen… - Advances in Neural …, 2024 - proceedings.neurips.cc
Adapter-style efficient transfer learning (ETL) has shown excellent performance in the tuning
of vision-language models (VLMs) under the low-data regime, where only a few additional …

RanPAC: Random projections and pre-trained models for continual learning

MD McDonnell, D Gong, A Parvaneh… - Advances in …, 2024 - proceedings.neurips.cc
Continual learning (CL) aims to incrementally learn different tasks (such as classification) in
a non-stationary data stream without forgetting old ones. Most CL works focus on tackling …

FacT: Factor-tuning for lightweight adaptation on vision transformer

S Jie, ZH Deng - Proceedings of the AAAI conference on artificial …, 2023 - ojs.aaai.org
Recent work has explored the potential to adapt a pre-trained vision transformer (ViT) by
updating only a few parameters so as to improve storage efficiency, called parameter …

DiffFit: Unlocking transferability of large diffusion models via simple parameter-efficient fine-tuning

E Xie, L Yao, H Shi, Z Liu, D Zhou… - Proceedings of the …, 2023 - openaccess.thecvf.com
Diffusion models have proven to be highly effective in generating high-quality images.
However, adapting large pre-trained diffusion models to new domains remains an open …