How to Protect Copyright Data in Optimization of Large Language Models?

T Chu, Z Song, C Yang - Proceedings of the AAAI Conference on …, 2024 - ojs.aaai.org
The softmax operator is a crucial component of large language models (LLMs), which have
played a transformative role in computer science research. Due to the centrality of the softmax …

Federated full-parameter tuning of billion-sized language models with communication cost under 18 kilobytes

Z Qin, D Chen, B Qian, B Ding, Y Li, S Deng - arXiv preprint arXiv …, 2023 - arxiv.org
Pre-trained large language models (LLMs) require fine-tuning to improve their
responsiveness to natural language instructions. Federated learning (FL) offers a way to …

Zero-th order algorithm for softmax attention optimization

Y Deng, Z Li, S Mahadevan, Z Song - arXiv preprint arXiv:2307.08352, 2023 - arxiv.org
Large language models (LLMs) have brought about significant transformations in human
society. Among the crucial computations in LLMs, the softmax unit holds great importance …

DPZero: dimension-independent and differentially private zeroth-order optimization

L Zhang, KK Thekumparampil, S Oh… - International Workshop on …, 2023 - openreview.net
The widespread practice of fine-tuning pretrained large language models (LLMs) on domain-
specific data faces two major challenges in memory and privacy. First, as the size of LLMs …

ZooPFL: Exploring black-box foundation models for personalized federated learning

W Lu, H Yu, J Wang, D Teney, H Wang, Y Chen… - arXiv preprint arXiv …, 2023 - arxiv.org
When personalized federated learning (FL) meets large foundation models, new challenges
arise from various limitations in resources. In addition to typical limitations such as data …

Fine-tuning and deploying large language models over edges: Issues and approaches

Y Dong, H Zhang, C Li, S Guo, V Leung… - arXiv preprint arXiv …, 2024 - arxiv.org
Since the invention of GPT-2 1.5B in 2019, large language models (LLMs) have
transitioned from specialized models to versatile foundation models. The LLMs exhibit …

Ferret: Federated Full-Parameter Tuning at Scale for Large Language Models

Y Shu, W Hu, SK Ng, BKH Low, FR Yu - arXiv preprint arXiv:2409.06277, 2024 - arxiv.org
Large Language Models (LLMs) have become indispensable in numerous real-world
applications. Unfortunately, fine-tuning these models at scale, especially in federated …

Communication-Efficient Byzantine-Resilient Federated Zero-Order Optimization

ASD Neto, M Egger, M Bakshi, R Bitar - arXiv preprint arXiv:2406.14362, 2024 - arxiv.org
We introduce CYBER-0, the first zero-order optimization algorithm for memory- and
communication-efficient federated learning, resilient to Byzantine faults. We show through …

Stochastic Two Points Method for Deep Model Zeroth-order Optimization

Y Pang, J Zhou - arXiv preprint arXiv:2402.01621, 2024 - arxiv.org
Large foundation models, such as large language models, have performed exceptionally
well in various application scenarios. Building or fully fine-tuning such large models is …
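Several entries above (the softmax-attention, federated full-parameter tuning, DPZero, CYBER-0, and stochastic two-points works) revolve around the same primitive: a two-point zeroth-order gradient estimate that needs only forward passes, no backpropagation. The snippet below is a minimal NumPy sketch of that generic estimator, not any listed paper's exact algorithm; the toy quadratic loss, smoothing radius mu, and step size are illustrative assumptions.

```python
import numpy as np

def two_point_zo_grad(loss, theta, mu=1e-3, rng=None):
    """Generic two-point zeroth-order gradient estimate.

    Perturbs the parameters along a random Gaussian direction u and uses
    the finite difference of the loss to approximate the directional
    derivative: g = (L(theta + mu*u) - L(theta - mu*u)) / (2*mu) * u.
    Only two loss evaluations are needed; no gradients of `loss` are used.
    """
    rng = rng or np.random.default_rng()
    u = rng.standard_normal(theta.shape)
    delta = loss(theta + mu * u) - loss(theta - mu * u)
    return (delta / (2.0 * mu)) * u

# Toy usage: minimize a quadratic with zeroth-order gradient descent.
loss = lambda w: float(np.sum((w - 1.0) ** 2))
w = np.zeros(10)
for _ in range(500):
    w -= 0.05 * two_point_zo_grad(loss, w)
print(loss(w))  # should be close to 0
```

One property visible in the sketch: the update is determined entirely by the scalar finite difference and the seed that generated the direction u, which is what makes seed-plus-scalar communication schemes, such as the sub-18-kilobyte federated tuning above, possible.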

Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning

J Xu, J Zhang - arXiv preprint arXiv:2405.02596, 2024 - arxiv.org
Fine-tuning large language models (LLMs) can be costly. Parameter-efficient fine-tuning
(PEFT) addresses this problem by training only a fraction of the parameters, whose success …
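To make the random-masking idea named in the title above concrete, the following is a hypothetical PyTorch sketch that trains only a randomly selected subset of individual parameters by zeroing all other gradients; the keep ratio, toy model, and hook-based mechanism are illustrative assumptions rather than the paper's exact procedure.

```python
import torch
import torch.nn as nn

def apply_random_mask(model: nn.Module, keep_ratio: float = 0.01, seed: int = 0):
    """Restrict training to a random subset of individual parameters.

    For each parameter tensor, sample a fixed binary mask once and register
    a gradient hook that zeroes gradients outside the mask, so only the
    selected entries ever receive updates from the optimizer.
    """
    gen = torch.Generator().manual_seed(seed)
    for p in model.parameters():
        mask = (torch.rand(p.shape, generator=gen) < keep_ratio).to(p.dtype)
        p.register_hook(lambda grad, m=mask: grad * m)

# Toy usage: only about 1% of this model's weights are trainable in effect.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))
apply_random_mask(model, keep_ratio=0.01)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(8, 32), torch.randint(0, 2, (8,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
opt.step()
```

Because the mask is fixed up front and applied through gradient hooks, plain SGD leaves the masked-out entries untouched and no change to the training loop is needed.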