Visual tuning

BXB Yu, J Chang, H Wang, L Liu, S Wang… - ACM Computing …, 2024 - dl.acm.org
Fine-tuning visual models has been widely shown to deliver promising performance on many
downstream visual tasks. With the remarkable development of pre-trained visual foundation …

MA-SAM: Modality-agnostic SAM adaptation for 3D medical image segmentation

C Chen, J Miao, D Wu, A Zhong, Z Yan, S Kim… - Medical Image …, 2024 - Elsevier
The Segment Anything Model (SAM), a foundation model for general image
segmentation, has demonstrated impressive zero-shot performance across numerous …

Revisiting class-incremental learning with pre-trained models: Generalizability and adaptivity are all you need

DW Zhou, ZW Cai, HJ Ye, DC Zhan, Z Liu - International Journal of …, 2024 - Springer
Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting
old ones. Traditional CIL models are trained from scratch to continually acquire knowledge …
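
A recurring baseline behind this line of work is a frozen pre-trained backbone paired with a nearest-class-mean (prototype) classifier: old prototypes are never rewritten, so adding classes cannot overwrite them. A minimal sketch of that baseline (the feature shapes and class API below are illustrative assumptions, not this paper's exact method):

    import numpy as np

    class PrototypeClassifier:
        # Incremental nearest-class-mean over frozen backbone features.
        # Old prototypes are never modified, so new classes cause no forgetting.
        def __init__(self):
            self.prototypes = {}                     # class id -> mean feature vector

        def add_class(self, cls, feats):             # feats: (n, d) frozen features
            self.prototypes[cls] = feats.mean(axis=0)

        def predict(self, feats):                    # feats: (m, d)
            ids = list(self.prototypes)
            protos = np.stack([self.prototypes[c] for c in ids])   # (k, d)
            # cosine similarity of each query against every stored prototype
            f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
            p = protos / np.linalg.norm(protos, axis=1, keepdims=True)
            return [ids[i] for i in (f @ p.T).argmax(axis=1)]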

Convolutional bypasses are better vision transformer adapters

S Jie, ZH Deng, S Chen, Z Jin - ECAI 2024, 2024 - ebooks.iospress.nl
The pretrain-then-finetune paradigm has been widely adopted in computer vision, but as the
size of Vision Transformers (ViTs) grows exponentially, full fine-tuning becomes prohibitive …
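
As a rough illustration of the title's idea, the sketch below adds a small convolutional bypass in parallel to a frozen ViT block; the channel sizes, activation, and placement are assumptions for illustration, not the paper's reported configuration:

    import torch
    import torch.nn as nn

    class ConvBypass(nn.Module):
        # Parallel adapter: tokens -> small conv stack on the spatial grid -> tokens.
        # Assumes a square patch layout with no CLS token.
        def __init__(self, dim: int, hidden: int = 8):
            super().__init__()
            self.down = nn.Conv2d(dim, hidden, kernel_size=1)
            self.conv = nn.Conv2d(hidden, hidden, kernel_size=3, padding=1)
            self.up = nn.Conv2d(hidden, dim, kernel_size=1)
            self.act = nn.GELU()

        def forward(self, tokens):                   # tokens: (B, N, D)
            B, N, D = tokens.shape
            s = int(N ** 0.5)                        # assume N is a perfect square
            x = tokens.transpose(1, 2).reshape(B, D, s, s)
            x = self.up(self.act(self.conv(self.act(self.down(x)))))
            return x.flatten(2).transpose(1, 2)      # back to (B, N, D)

During tuning, only the bypass parameters are updated and its output is added to the frozen block's output, which is what keeps the adapter parameter-efficient.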

One-for-All: Generalized LoRA for parameter-efficient fine-tuning

A Chavan, Z Liu, D Gupta, E Xing, Z Shen - arXiv preprint arXiv …, 2023 - arxiv.org
We present Generalized LoRA (GLoRA), an advanced approach for universal parameter-
efficient fine-tuning tasks. Enhancing Low-Rank Adaptation (LoRA), GLoRA employs a …
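
For context, LoRA, which GLoRA builds on, reparameterizes a frozen weight matrix W as W + (alpha/r)·BA with low-rank factors A and B. A minimal sketch of plain LoRA (the rank and scaling values are illustrative; GLoRA's additional scale and shift paths are not shown):

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        # Wraps a frozen linear layer with a trainable low-rank update:
        # y = base(x) + (alpha / r) * x @ A^T @ B^T
        def __init__(self, base: nn.Linear, r: int = 4, alpha: float = 8.0):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad = False          # freeze the pre-trained weights
            self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: update starts at zero
            self.scale = alpha / r

        def forward(self, x):
            return self.base(x) + self.scale * (x @ self.A.t() @ self.B.t())

Only A and B receive gradients, so the trainable parameter count scales with the rank r rather than with the full weight matrix.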

Revisiting the parameter efficiency of adapters from the perspective of precision redundancy

S Jie, H Wang, ZH Deng - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Current state-of-the-art results in computer vision depend in part on fine-tuning large pre-
trained vision models. However, with the exponential growth of model sizes, the …
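
The title's premise, that adapter weights carry precision redundancy, can be pictured with a simple post-training quantizer applied to a trained adapter tensor. This is a generic symmetric scheme for illustration only; the paper's actual quantization design may differ:

    import torch

    def quantize_symmetric(w: torch.Tensor, bits: int = 1):
        # Symmetric uniform quantization of adapter weights; with bits=1 this
        # reduces to sign(w) * mean(|w|), the classic binary-weight scheme.
        if bits == 1:
            scale = w.abs().mean()
            return torch.sign(w) * scale, scale
        qmax = 2 ** (bits - 1) - 1
        scale = w.abs().max().clamp(min=1e-8) / qmax
        q = torch.clamp(torch.round(w / scale), -qmax, qmax)
        return q * scale, scale                  # dequantized weights and the stored scale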

Mind the interference: Retaining pre-trained knowledge in parameter efficient continual learning of vision-language models

L Tang, Z Tian, K Li, C He, H Zhou, H Zhao, X Li… - … on Computer Vision, 2025 - Springer
This study addresses the Domain-Class Incremental Learning problem, a realistic but
challenging continual learning scenario where both the domain distribution and target …

Few-shot adaptation of multi-modal foundation models: A survey

F Liu, T Zhang, W Dai, C Zhang, W Cai, X Zhou… - Artificial Intelligence …, 2024 - Springer
Multi-modal (vision-language) models, such as CLIP, are replacing traditional
supervised pre-training models (e.g., ImageNet-based pre-training) as the new generation of …
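
For context on what is being adapted, the sketch below runs CLIP-style zero-shot classification by scoring an image embedding against text-prompt embeddings, assuming the openai/CLIP package; the prompts and image path are illustrative, and few-shot methods adapt on top of these scores:

    import torch
    import clip
    from PIL import Image

    # Embed class-name prompts once, then score an image by cosine similarity.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, preprocess = clip.load("ViT-B/32", device=device)

    classes = ["a photo of a cat", "a photo of a dog"]     # illustrative prompts
    text = clip.tokenize(classes).to(device)
    image = preprocess(Image.open("example.jpg")).unsqueeze(0).to(device)  # hypothetical file

    with torch.no_grad():
        t = model.encode_text(text)
        i = model.encode_image(image)
        t = t / t.norm(dim=-1, keepdim=True)
        i = i / i.norm(dim=-1, keepdim=True)
        probs = (100.0 * i @ t.T).softmax(dim=-1)
    print(probs)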

PYRA: Parallel yielding re-activation for training-inference efficient task adaptation

Y Xiong, H Chen, T Hao, Z Lin, J Han, Y Zhang… - … on Computer Vision, 2025 - Springer
Recently, the scale of transformers has grown rapidly, which introduces considerable
challenges in terms of training overhead and inference efficiency in the scope of task …