H Shi, Y Wang, L Han, H Zhang, H Wang - arXiv preprint arXiv:2412.05723, 2024 - arxiv.org
Estimating the uncertainty of responses of Large Language Models (LLMs) remains a critical challenge. While recent Bayesian methods have demonstrated effectiveness in …
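As a generic illustration of sampling-based uncertainty estimation (not the Bayesian method of this particular paper), predictive entropy over repeated sampled answers is a common proxy; `sample_response` below is a hypothetical stub standing in for an LLM call:

```python
# Minimal sketch of sampling-based uncertainty for LLM answers.
# `sample_response` is a hypothetical stub; in practice it would be a
# temperature-sampled decode from a model.
from collections import Counter
import math
import random

def sample_response(prompt: str) -> str:
    # Hypothetical stand-in for a stochastic LLM call.
    return random.choice(["Paris", "Paris", "Paris", "Lyon"])

def predictive_entropy(prompt: str, n_samples: int = 50) -> float:
    counts = Counter(sample_response(prompt) for _ in range(n_samples))
    probs = [c / n_samples for c in counts.values()]
    return -sum(p * math.log(p) for p in probs)  # higher = more uncertain

print(predictive_entropy("What is the capital of France?"))
```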
Q Wang, X Hu, W Xu, W Liu, J Luan, B Wang - arXiv preprint arXiv …, 2024 - arxiv.org
Low-rank adaptation (LoRA) and its variants have recently gained much interest due to their ability to avoid excessive inference costs. However, LoRA still encounters the following …
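A minimal sketch of why LoRA avoids inference overhead, as the snippet notes: the trained low-rank factors can be merged into the frozen weight once training is done, so serving needs a single matmul. Shapes, rank, and initialization here are illustrative assumptions, not taken from the paper:

```python
# LoRA merge sketch: W @ x + B @ (A @ x) == (W + B @ A) @ x.
# B is random here so the equivalence check is non-trivial; LoRA itself
# initializes B to zero so training starts from the pre-trained model.
import numpy as np

d_out, d_in, r = 64, 64, 4
W = np.random.randn(d_out, d_in)      # frozen pre-trained weight
A = np.random.randn(r, d_in) * 0.01   # trainable down-projection
B = np.random.randn(d_out, r) * 0.01  # trainable up-projection

x = np.random.randn(d_in)
y_adapter = W @ x + B @ (A @ x)       # training-time path: two extra matmuls
W_merged = W + B @ A                  # one-off merge after training
y_merged = W_merged @ x               # inference path: single matmul
assert np.allclose(y_adapter, y_merged)
```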
F Meng, M Zhang - arXiv preprint arXiv:2411.17426, 2024 - arxiv.org
To adapt a well-trained large model to downstream tasks, we propose constraining learning within its original latent space by leveraging linear combinations of its basis vectors. This …
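A minimal sketch of the stated idea, under my own assumptions about the details (the paper's exact construction is not shown in the snippet): obtain basis vectors of the pre-trained weight via SVD and train only the coefficients of their linear combination, so the update never leaves the original subspace:

```python
# Sketch (assumed formulation): confine the weight update to linear
# combinations of the pre-trained weight's singular-vector basis.
import numpy as np

W = np.random.randn(32, 32)                # pre-trained weight
U, s, Vt = np.linalg.svd(W)                # columns of U / rows of Vt: basis

k = 8                                      # illustrative subspace size
coeffs = np.random.randn(k) * 0.01         # the only trainable parameters
delta_W = (U[:, :k] * coeffs) @ Vt[:k, :]  # sum_i coeffs[i] * u_i v_i^T
W_adapted = W + delta_W                    # update stays in the top-k span
```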
Y Zhong, H Jiang, L Li, R Nakada, T Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Fine-tuning pre-trained models is crucial for adapting large models to downstream tasks, often delivering state-of-the-art performance. However, fine-tuning all model parameters is …
Recent advances in Large Language Models (LLMs) have led to significant improvements in natural language processing tasks, but their ability to generate human-quality text raises …
Q Sun, E Cetin, Y Tang - arXiv preprint arXiv:2501.06252, 2025 - arxiv.org
Self-adaptive large language models (LLMs) aim to solve the challenges posed by traditional fine-tuning methods, which are often computationally intensive and static in their …
Y Lu, B Qian, C Yuan, H Jiang, X Wang - arXiv preprint arXiv:2410.16801, 2024 - arxiv.org
Large language models (LLMs) exhibit remarkable capabilities in natural language processing but face catastrophic forgetting when learning new tasks, where adaptation to a …
Y Zhong, Y Zhou - arXiv preprint arXiv:2407.09946, 2024 - arxiv.org
Low-rank adaptation (LoRA) is a powerful parameter-efficient fine-tuning method that utilizes low-rank projectors $A$ and $B$ to learn weight updates $\Delta W$ for adaptation …
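For reference, the standard LoRA parameterization that the snippet's symbols denote (the standard formulation; any paper-specific variant is not reproduced here):

```latex
% W is frozen; only the low-rank factors A and B are trained.
\[
W' = W + \Delta W = W + BA,
\qquad B \in \mathbb{R}^{d \times r},\;
A \in \mathbb{R}^{r \times k},\;
r \ll \min(d, k).
\]
```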
Z Li, M Mak, M Pilanci, H Lee, H Meng - arXiv preprint arXiv:2501.03829, 2025 - arxiv.org
Previous research has shown that the principal singular vectors of a pre-trained model's weight matrices capture critical knowledge. In contrast, those associated with small singular …
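A small numerical sketch of the premise: truncating a weight matrix to its top singular directions retains most of its Frobenius norm, which is why the principal singular vectors are said to carry the critical structure. The matrix below is random, so the effect is weaker than for real pre-trained weights:

```python
# Rank-k SVD truncation: keep only the top-k singular directions and
# measure how much of the matrix they reconstruct.
import numpy as np

W = np.random.randn(64, 64)
U, s, Vt = np.linalg.svd(W)

for k in (4, 16, 64):
    W_k = (U[:, :k] * s[:k]) @ Vt[:k, :]  # rank-k approximation of W
    rel_err = np.linalg.norm(W - W_k) / np.linalg.norm(W)
    print(f"rank {k:2d}: relative error {rel_err:.3f}")
```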