Parameter-efficient fine-tuning for large models: A comprehensive survey

Z Han, C Gao, J Liu, SQ Zhang - arXiv preprint arXiv:2403.14608, 2024 - arxiv.org
Large models represent a groundbreaking advancement in multiple application fields,
enabling remarkable achievements across various tasks. However, their unprecedented …

Visual tuning

BXB Yu, J Chang, H Wang, L Liu, S Wang… - ACM Computing …, 2023 - dl.acm.org
Fine-tuning visual models has been widely shown to yield promising performance on many
downstream visual tasks. With the surprising development of pre-trained visual foundation …

LLaMA-Adapter V2: Parameter-efficient visual instruction model

P Gao, J Han, R Zhang, Z Lin, S Geng, A Zhou… - arXiv preprint arXiv …, 2023 - arxiv.org
How to efficiently transform large language models (LLMs) into instruction followers has
recently become a popular research direction, while training LLMs for multi-modal reasoning remains …

CLIP-Adapter: Better vision-language models with feature adapters

P Gao, S Geng, R Zhang, T Ma, R Fang… - International Journal of …, 2024 - Springer
Large-scale contrastive vision-language pretraining has shown significant progress in visual
representation learning. Unlike traditional visual systems trained by a fixed set of discrete …
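
For context on the entry above, the feature-adapter idea can be illustrated with a minimal sketch: a small bottleneck MLP refines frozen CLIP features and blends them back with the originals via a residual ratio. This is a hedged sketch of the general CLIP-Adapter recipe; the class name, dimensions, and ratio `alpha` below are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class FeatureAdapter(nn.Module):
    """Minimal CLIP-Adapter-style module (illustrative sketch, not the paper's code).

    A small bottleneck MLP refines frozen CLIP features; a residual ratio
    `alpha` blends the adapted features with the original ones.
    """

    def __init__(self, dim: int = 512, reduction: int = 4, alpha: float = 0.2):
        super().__init__()
        self.alpha = alpha
        self.adapter = nn.Sequential(
            nn.Linear(dim, dim // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(dim // reduction, dim, bias=False),
            nn.ReLU(inplace=True),
        )

    def forward(self, clip_features: torch.Tensor) -> torch.Tensor:
        adapted = self.adapter(clip_features)
        # Residual blending: retain most of the pre-trained knowledge and add
        # a small task-specific correction learned from few-shot data.
        return self.alpha * adapted + (1 - self.alpha) * clip_features


# Only the adapter's parameters are trained; the CLIP backbone stays frozen.
features = torch.randn(8, 512)   # e.g. frozen CLIP image embeddings
adapter = FeatureAdapter(dim=512)
refined = adapter(features)      # same shape, ready for similarity-based classification
```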

Deep class-incremental learning: A survey

DW Zhou, QW Wang, ZH Qi, HJ Ye, DC Zhan… - arXiv preprint arXiv …, 2023 - arxiv.org
Deep models, e.g., CNNs and Vision Transformers, have achieved impressive performance
in many vision tasks in the closed world. However, novel classes emerge from time to time in …

VPGTrans: Transfer visual prompt generator across LLMs

A Zhang, H Fei, Y Yao, W Ji, L Li… - Advances in Neural …, 2024 - proceedings.neurips.cc
Since developing a new multimodal LLM (MLLM) by pre-training on a tremendous number of image-text
pairs from scratch can be exceedingly resource-consuming, connecting an existing LLM with …

GraphAdapter: Tuning vision-language models with dual knowledge graph

X Li, D Lian, Z Lu, J Bai, Z Chen… - Advances in Neural …, 2024 - proceedings.neurips.cc
Adapter-style efficient transfer learning (ETL) has shown excellent performance in the tuning
of vision-language models (VLMs) under the low-data regime, where only a few additional …

RanPAC: Random projections and pre-trained models for continual learning

MD McDonnell, D Gong, A Parvaneh… - Advances in …, 2024 - proceedings.neurips.cc
Continual learning (CL) aims to incrementally learn different tasks (such as classification) in
a non-stationary data stream without forgetting old ones. Most CL works focus on tackling …

FacT: Factor-tuning for lightweight adaptation on vision transformer

S Jie, ZH Deng - Proceedings of the AAAI conference on artificial …, 2023 - ojs.aaai.org
Recent work has explored the potential to adapt a pre-trained vision transformer (ViT) by
updating only a few parameters so as to improve storage efficiency, called parameter …
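
To make the entry above concrete, here is a minimal, hedged sketch of the general factor-tuning idea: frozen ViT weights are augmented with low-rank increments whose factors are shared across layers, and only those shared factors plus tiny per-layer cores are trained. The class name `FactorTunedLinear`, the rank, and the scale value are illustrative assumptions; FacT itself derives the factorization from tensor decompositions (Tensor-Train / Tucker) over all ViT weights, which this simplified sketch only approximates.

```python
import torch
import torch.nn as nn

class FactorTunedLinear(nn.Module):
    """Illustrative factor-tuning sketch (not the FacT paper's exact code).

    The frozen pre-trained weight W is augmented with a low-rank increment
    built from factors U and V that are shared across layers, plus a tiny
    per-layer core C. Only U, V and the cores are trained, so the number of
    new parameters grows very slowly with model depth.
    """

    def __init__(self, linear: nn.Linear, U: nn.Parameter, V: nn.Parameter,
                 rank: int = 8, scale: float = 0.1):
        super().__init__()
        self.linear = linear
        for p in self.linear.parameters():
            p.requires_grad = False                         # backbone stays frozen
        self.U, self.V = U, V                               # shared factors (d x r)
        self.core = nn.Parameter(torch.zeros(rank, rank))   # per-layer core, starts at zero
        self.scale = scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        delta_w = self.U @ self.core @ self.V.t()           # low-rank weight increment
        return self.linear(x) + self.scale * (x @ delta_w.t())


# Hypothetical usage: share U and V across every tuned layer of a ViT block.
d, rank = 768, 8
U = nn.Parameter(torch.randn(d, rank) * 0.02)
V = nn.Parameter(torch.randn(d, rank) * 0.02)
layer1 = FactorTunedLinear(nn.Linear(d, d), U, V, rank)
layer2 = FactorTunedLinear(nn.Linear(d, d), U, V, rank)     # same U, V; new core
out = layer2(layer1(torch.randn(4, d)))
```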

DiffFit: Unlocking transferability of large diffusion models via simple parameter-efficient fine-tuning

E Xie, L Yao, H Shi, Z Liu, D Zhou… - Proceedings of the …, 2023 - openaccess.thecvf.com
Diffusion models have proven to be highly effective in generating high-quality images.
However, adapting large pre-trained diffusion models to new domains remains an open …
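
The parameter-efficient recipe named in the title above can be illustrated with a short, hedged sketch: freeze the pre-trained weights and train only the bias terms plus newly inserted per-block scale factors. The names `ScaledBlock` and `configure_difffit_style` and the toy backbone are illustrative assumptions, not the authors' implementation, which applies this idea to specific blocks of large diffusion models.

```python
import torch
import torch.nn as nn

class ScaledBlock(nn.Module):
    """Wraps a frozen block with a learnable output scale (initialised to 1)."""

    def __init__(self, block: nn.Module):
        super().__init__()
        self.block = block
        self.gamma = nn.Parameter(torch.ones(1))   # newly added scale factor

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.gamma * self.block(x)


def configure_difffit_style(model: nn.Module) -> None:
    """Freeze all weights; keep only bias terms and `gamma` scales trainable."""
    for name, param in model.named_parameters():
        param.requires_grad = name.endswith(".bias") or name.endswith(".gamma")


# Hypothetical toy backbone standing in for a large pre-trained diffusion model.
backbone = nn.Sequential(ScaledBlock(nn.Linear(64, 64)), ScaledBlock(nn.Linear(64, 64)))
configure_difffit_style(backbone)
trainable = sum(p.numel() for p in backbone.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")   # only biases and the two scale factors
```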