E^2VPT: An Effective and Efficient Approach for Visual Prompt Tuning

C Han, Q Wang, Y Cui, Z Cao, W Wang, S Qi… - arXiv preprint arXiv …, 2023 - arxiv.org
As the size of transformer-based models continues to grow, fine-tuning these large-scale
pretrained vision models for new tasks has become increasingly parameter-intensive …

Visual prompt tuning

M Jia, L Tang, BC Chen, C Cardie, S Belongie… - … on Computer Vision, 2022 - Springer
The current modus operandi in adapting pre-trained models involves updating all the
backbone parameters, i.e., full fine-tuning. This paper introduces Visual Prompt Tuning (VPT) …

Unleashing the power of visual prompting at the pixel level

J Wu, X Li, C Wei, H Wang, A Yuille, Y Zhou… - arXiv preprint arXiv …, 2022 - arxiv.org
This paper presents a simple and effective visual prompting method for adapting pre-trained
models to downstream recognition tasks. Our method includes two key designs. First, rather …

Prompt Generation Networks for Input-based Adaptation of Frozen Vision Transformers

J Loedeman, MC Stol, T Han, YM Asano - arXiv preprint arXiv:2210.06466, 2022 - arxiv.org
With the introduction of the transformer architecture in computer vision, increasing model
scale has been demonstrated as a clear path to achieving performance and robustness …

Pro-tuning: Unified prompt tuning for vision tasks

X Nie, B Ni, J Chang, G Meng, C Huo… - … on Circuits and …, 2023 - ieeexplore.ieee.org
In computer vision, fine-tuning is the de-facto approach to leverage pre-trained vision
models to perform downstream tasks. However, deploying it in practice is quite challenging …

Lion: Implicit vision prompt tuning

H Wang, J Chang, Y Zhai, X Luo, J Sun, Z Lin… - Proceedings of the …, 2024 - ojs.aaai.org
Despite recent promising performances across a range of vision tasks, vision Transformers
still have an issue of high computational costs. Recently, vision prompt learning has …

Learning expressive prompting with residuals for vision transformers

R Das, Y Dukler, A Ravichandran… - Proceedings of the …, 2023 - openaccess.thecvf.com
Prompt learning is an efficient approach to adapt transformers by inserting a learnable set of
parameters into the input and intermediate representations of a pre-trained model. In this …

Exploring visual prompts for adapting large-scale models

H Bahng, A Jahanian, S Sankaranarayanan… - arXiv preprint arXiv …, 2022 - arxiv.org
We investigate the efficacy of visual prompting to adapt large-scale models in vision.
Following the recent approach from prompt tuning and adversarial reprogramming, we learn …

Visual query tuning: Towards effective usage of intermediate representations for parameter and memory efficient transfer learning

CH Tu, Z Mai, WL Chao - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
Intermediate features of a pre-trained model have been shown to be informative for making
accurate predictions on downstream tasks, even if the model backbone is frozen. The key …

Parameter-efficient fine-tuning for pre-trained vision models: A survey

Y Xin, S Luo, H Zhou, J Du, X Liu, Y Fan, Q Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Large-scale pre-trained vision models (PVMs) have shown great potential for adaptability
across various downstream vision tasks. However, with state-of-the-art PVMs growing to …