LLM-based edge intelligence: A comprehensive survey on architectures, applications, security and trustworthiness

O Friha, MA Ferrag, B Kantarci… - IEEE Open Journal …, 2024 - ieeexplore.ieee.org
The integration of Large Language Models (LLMs) and Edge Intelligence (EI) introduces a
groundbreaking paradigm for intelligent edge devices. With their capacity for human-like …

On efficient training of large-scale deep learning models: A literature review

L Shen, Y Sun, Z Yu, L Ding, X Tian, D Tao - arXiv preprint arXiv …, 2023 - arxiv.org
The field of deep learning has witnessed significant progress, particularly in computer vision
(CV), natural language processing (NLP), and speech. The use of large-scale models …

CocktailSGD: Fine-tuning foundation models over 500Mbps networks

J Wang, Y Lu, B Yuan, B Chen… - International …, 2023 - proceedings.mlr.press
Distributed training of foundation models, especially large language models (LLMs), is
communication-intensive and so has heavily relied on centralized data centers with fast …

When foundation model meets federated learning: Motivations, challenges, and future directions

W Zhuang, C Chen, L Lyu - arXiv preprint arXiv:2306.15546, 2023 - arxiv.org
The intersection of the Foundation Model (FM) and Federated Learning (FL) provides mutual
benefits, presents a unique opportunity to unlock new possibilities in AI research, and …

Decentralized bilevel optimization

X Chen, M Huang, S Ma - Optimization Letters, 2024 - Springer
Bilevel optimization has been successfully applied to many important machine learning
problems. Algorithms for solving bilevel optimization have been studied under various …

Distributed inference and fine-tuning of large language models over the internet

A Borzunov, M Ryabinin… - Advances in …, 2024 - proceedings.neurips.cc
Large language models (LLMs) are useful in many NLP tasks and become more capable
with size, with the best open-source models having over 50 billion parameters. However …

Decentralized SGD and average-direction SAM are asymptotically equivalent

T Zhu, F He, K Chen, M Song… - … Conference on Machine …, 2023 - proceedings.mlr.press
Decentralized stochastic gradient descent (D-SGD) allows collaborative learning on
massive devices simultaneously without the control of a central server. However, existing …

Caribou: Fine-Grained Geospatial Shifting of Serverless Applications for Sustainability

VU Gsteiger, PH Long, Y Sun, P Javanrood… - Proceedings of the …, 2024 - dl.acm.org
Sustainability in computing is critical as environmental concerns rise. The cloud industry's
carbon footprint is significant and rapidly growing. We show that dynamic geospatial shifting …

A survey on scheduling techniques in computing and network convergence

S Tang, Y Yu, H Wang, G Wang, W Chen… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org
The computing demand for massive applications has led to the ubiquitous deployment of
computing power. This trend results in the urgent need for higher-level computing resource …

Compress, then prompt: Improving accuracy-efficiency trade-off of LLM inference with transferable prompt

Z Xu, Z Liu, B Chen, Y Tang, J Wang, K Zhou… - arXiv preprint arXiv …, 2023 - arxiv.org
While the numerous parameters in Large Language Models (LLMs) contribute to their
superior performance, this massive scale makes them inefficient and memory-hungry. Thus …