AI-based fog and edge computing: A systematic review, taxonomy and future directions

S Iftikhar, SS Gill, C Song, M Xu, MS Aslanpour… - Internet of Things, 2023 - Elsevier
Resource management in computing is a very challenging problem that involves making
sequential decisions. Resource limitations, resource heterogeneity, dynamic and diverse …

Offloading machine learning to programmable data planes: A systematic survey

R Parizotto, BL Coelho, DC Nunes, I Haque… - ACM Computing …, 2023 - dl.acm.org
The demand for machine learning (ML) has increased significantly in recent decades,
enabling several applications, such as speech recognition, computer vision, and …

CocktailSGD: Fine-tuning foundation models over 500Mbps networks

J Wang, Y Lu, B Yuan, B Chen… - International …, 2023 - proceedings.mlr.press
Distributed training of foundation models, especially large language models (LLMs), is
communication-intensive and so has heavily relied on centralized data centers with fast …

Scaling distributed machine learning with in-network aggregation

A Sapio, M Canini, CY Ho, J Nelson, P Kalnis… - … USENIX Symposium on …, 2021 - usenix.org
Training machine learning models in parallel is an increasingly important workload. We
accelerate distributed parallel training by designing a communication primitive that uses a …
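
The primitive being described is an in-network allreduce: the switch sums gradient chunks element-wise, using fixed-point arithmetic since programmable switches lack floating-point units. A minimal host-side sketch of that reduction, assuming numpy (names are illustrative stand-ins for the switch pipeline, not SwitchML's actual API):

```python
import numpy as np

SCALE = 2 ** 16  # switches do integer math, so floats become fixed-point

def to_fixed_point(grad):
    return np.round(grad * SCALE).astype(np.int64)

def aggregate_chunks(worker_chunks):
    """Element-wise sum of one gradient chunk per worker (the in-network step)."""
    return np.sum(worker_chunks, axis=0)

def from_fixed_point(agg):
    return agg.astype(np.float64) / SCALE

# Example: three workers contribute one 4-element chunk each.
grads = [np.random.randn(4) for _ in range(3)]
summed = from_fixed_point(aggregate_chunks([to_fixed_point(g) for g in grads]))
assert np.allclose(summed, np.sum(grads, axis=0), atol=1e-3)
```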

Vulnerabilities in federated learning

N Bouacida, P Mohapatra - IEEE Access, 2021 - ieeexplore.ieee.org
With more regulations tackling the protection of users' privacy-sensitive data in recent years,
access to such data has become increasingly restricted. A new decentralized training …

EF21: A new, simpler, theoretically better, and practically faster error feedback

P Richtárik, I Sokolov… - Advances in Neural …, 2021 - proceedings.neurips.cc
Error feedback (EF), also known as error compensation, is an immensely popular
convergence stabilization mechanism in the context of distributed training of supervised …
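
EF21's mechanism is compact: each worker keeps a running estimate of its gradient and transmits only a compressed correction toward the true gradient, so compression error cannot accumulate unboundedly. A minimal sketch of one step, assuming a top-k compressor and numpy (variable names are mine, not the paper's):

```python
import numpy as np

def top_k(v, k):
    """Contractive compressor: keep only the k largest-magnitude entries."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def ef21_step(x, g_locals, grad_fns, lr=0.1, k=10):
    # Descend along the average of the workers' current gradient estimates.
    x = x - lr * np.mean(g_locals, axis=0)
    # Each worker ships only a compressed correction toward its true gradient.
    for i, grad_fn in enumerate(grad_fns):
        g_locals[i] = g_locals[i] + top_k(grad_fn(x) - g_locals[i], k)
    return x, g_locals
```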

Compute-efficient deep learning: Algorithmic trends and opportunities

BR Bartoldson, B Kailkhura, D Blalock - Journal of Machine Learning …, 2023 - jmlr.org
Although deep learning has made great progress in recent years, the exploding economic
and environmental costs of training neural networks are becoming unsustainable. To …

Collaborative optimization and aggregation for decentralized domain generalization and adaptation

G Wu, S Gong - Proceedings of the IEEE/CVF International …, 2021 - openaccess.thecvf.com
Contemporary domain generalization (DG) and multi-source unsupervised domain
adaptation (UDA) methods mostly collect data from multiple domains together for joint …

GRACE: A compressed communication framework for distributed machine learning

H Xu, CY Ho, AM Abdelmoniem, A Dutta… - 2021 IEEE 41st …, 2021 - ieeexplore.ieee.org
Powerful computer clusters are nowadays used to train complex deep neural networks
(DNNs) on large datasets. Distributed training is increasingly becoming communication-bound …
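
GRACE's premise is that most gradient compression schemes reduce to a compress/decompress pair that can be swapped behind the training loop. A rough sketch of such an interface with a random-k sparsifier, assuming numpy (class and method names are hypothetical, not GRACE's actual API):

```python
import numpy as np

class RandomKCompressor:
    """Hypothetical compressor fitting a compress/decompress interface:
    transmit k uniformly sampled coordinates, rescaled to stay unbiased."""
    def __init__(self, k):
        self.k = k

    def compress(self, tensor):
        flat = tensor.ravel()
        idx = np.random.choice(flat.size, self.k, replace=False)
        values = flat[idx] * (flat.size / self.k)  # unbiasedness rescaling
        return values, idx, tensor.shape

    def decompress(self, values, idx, shape):
        out = np.zeros(int(np.prod(shape)))
        out[idx] = values
        return out.reshape(shape)

# Usage: only (values, idx) cross the network instead of the full gradient.
g = np.random.randn(1000)
approx = RandomKCompressor(k=50).decompress(*RandomKCompressor(k=50).compress(g))
```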

Towards efficient communications in federated learning: A contemporary survey

Z Zhao, Y Mao, Y Liu, L Song, Y Ouyang… - Journal of the Franklin …, 2023 - Elsevier
In the traditional distributed machine learning scenario, the user's private data is transmitted
between clients and a central server, which results in significant potential privacy risks. In …
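
The communication cost such surveys target shows up plainly in a bare-bones federated averaging round: a full copy of the model crosses the network twice per participating client, which is what compression, client sampling, and fewer rounds all try to shrink. A schematic sketch, assuming numpy (FedAvg in the style of McMahan et al.; helper names are placeholders):

```python
import numpy as np

def fedavg_round(global_w, clients, local_steps=5, lr=0.01):
    """One FedAvg round. Every arrow across the network carries a full model:
    a download per client, local SGD with no communication, then an upload."""
    updates, sizes = [], []
    for n_samples, grad_fn in clients:     # download: global_w -> client
        w = global_w.copy()
        for _ in range(local_steps):
            w = w - lr * grad_fn(w)        # local training only
        updates.append(w)                  # upload: w -> server
        sizes.append(n_samples)
    coef = np.array(sizes) / sum(sizes)    # weight by local dataset size
    return sum(c * u for c, u in zip(coef, updates))
```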