Topologies in distributed machine learning: Comprehensive survey, recommendations and future directions

L Liu, P Zhou, G Sun, X Chen, T Wu, H Yu, M Guizani - Neurocomputing, 2023 - Elsevier
With the widespread use of distributed machine learning (DML), many IT companies have
established networks dedicated to DML. Different communication architectures of DML have …
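
To make the contrast between communication architectures concrete, here is a toy comparison (illustrative formulas and numbers, not taken from the survey) of per-step traffic under a centralized parameter server versus a ring all-reduce:

    def parameter_server_bottleneck(n_workers: int, model_bytes: int) -> int:
        # Each worker pushes gradients and pulls the model (2*M), so the
        # central server's link must carry 2*M*N per step: the bottleneck.
        return 2 * model_bytes * n_workers

    def ring_allreduce_per_node(n_workers: int, model_bytes: int) -> float:
        # Ring all-reduce: every node sends and receives 2*(N-1)/N * M,
        # nearly independent of N, with no single hot link.
        return 2 * (n_workers - 1) / n_workers * model_bytes

    M = 400 * 2**20  # e.g. a ~100M-parameter fp32 model (~400 MiB)
    for n in (4, 16, 64):
        print(n, parameter_server_bottleneck(n, M), ring_allreduce_per_node(n, M))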

COFEL: Communication-efficient and optimized federated learning with local differential privacy

Z Lian, W Wang, C Su - ICC 2021-IEEE International …, 2021 - ieeexplore.ieee.org
Federated learning can collaboratively train a global model without gathering clients' private
data. Many works focus on reducing communication cost by designing various client …
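
As a rough illustration of the local-differential-privacy side (the clipping and Laplace mechanism below are generic textbook choices, not necessarily COFEL's exact design), each client perturbs its update before anything leaves the device:

    import numpy as np

    def ldp_perturb(update, clip=1.0, epsilon=1.0, rng=np.random.default_rng()):
        # Clip to bound sensitivity, then add Laplace noise locally, so the
        # server only ever sees a privatized update.
        clipped = update * min(1.0, clip / (np.linalg.norm(update) + 1e-12))
        return clipped + rng.laplace(scale=2 * clip / epsilon, size=update.shape)

    updates = [np.random.randn(10) for _ in range(5)]
    global_update = np.mean([ldp_perturb(u, epsilon=2.0) for u in updates], axis=0)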

One more config is enough: Saving (DC)TCP for high-speed extremely shallow-buffered datacenters

W Bai, S Hu, K Chen, K Tan… - IEEE/ACM Transactions …, 2020 - ieeexplore.ieee.org
The link speed in production datacenters is growing fast, from 1 Gbps to 40 Gbps or even
100 Gbps. However, the buffer size of commodity switches increases slowly, e.g., from 4 MB …
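
The tension is easy to quantify with the classic DCTCP rule of thumb that the ECN marking threshold K should be roughly C * RTT / 7 (the 100 us RTT below is an illustrative assumption):

    def dctcp_threshold_bytes(link_gbps: float, rtt_us: float) -> float:
        # K ~ C * RTT / 7, the standard DCTCP marking-threshold guideline.
        return (link_gbps * 1e9 / 8) * (rtt_us * 1e-6) / 7

    for gbps in (1, 40, 100):
        print(f"{gbps:>3} Gbps -> K ~ {dctcp_threshold_bytes(gbps, 100) / 1e3:.0f} KB per port")
    # At 100 Gbps, K alone approaches ~180 KB per port, so a handful of busy
    # ports can exhaust a ~4 MB shared buffer: the problem this paper tackles.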

Communication-Efficient Large-Scale Distributed Deep Learning: A Comprehensive Survey

F Liang, Z Zhang, H Lu, V Leung, Y Guo… - arXiv preprint arXiv …, 2024 - arxiv.org
With the rapid growth of datasets, models, and device counts in deep learning, large-scale
distributed deep learning is attracting increasing attention. In contrast to …
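
One staple technique from this literature is gradient quantization; a minimal 8-bit quantize/dequantize round trip (a generic sketch, not any specific scheme from the survey) looks like:

    import numpy as np

    def quantize_int8(grad):
        # Map floats onto 256 levels; send int8 values plus one fp32 scale,
        # roughly a 4x reduction over fp32 on the wire.
        scale = np.abs(grad).max() / 127 + 1e-12
        return np.round(grad / scale).astype(np.int8), scale

    def dequantize(q, scale):
        return q.astype(np.float32) * scale

    g = np.random.randn(1000).astype(np.float32)
    q, s = quantize_int8(g)
    err = np.abs(dequantize(q, s) - g).max()  # bounded by about scale / 2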

GRID: Gradient routing with in-network aggregation for distributed training

J Fang, G Zhao, H Xu, C Wu… - IEEE/ACM Transactions on …, 2023 - ieeexplore.ieee.org
As the scale of distributed training increases, so does the communication overhead in
clusters. Some works try to reduce the communication cost through gradient compression or …
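
A toy version of the in-network aggregation idea (the two-level tree and transfer counting are illustrative assumptions, not GRID's routing algorithm): switches sum gradients from their children, so each upstream link carries one tensor instead of one per worker:

    import numpy as np

    def aggregate(node, grads):
        # node is either a worker index (leaf) or a list of child nodes (switch).
        if isinstance(node, int):
            return grads[node], 0
        total, transfers = np.zeros_like(grads[0]), 0
        for child in node:
            g, t = aggregate(child, grads)
            total += g               # the switch reduces in the data path
            transfers += t + 1       # one upstream tensor per child
        return total, transfers

    grads = [np.full(4, float(i)) for i in range(4)]
    tree = [[0, 1], [2, 3]]          # two ToR switches under one spine
    summed, transfers = aggregate(tree, grads)   # 6 transfers vs 8 without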

Communication optimization algorithms for distributed deep learning systems: A survey

E Yu, D Dong, X Liao - IEEE Transactions on Parallel and …, 2023 - ieeexplore.ieee.org
Deep learning's widespread adoption in various fields has made distributed training across
multiple computing nodes essential. However, frequent communication between nodes can …

Layer-based communication-efficient federated learning with privacy preservation

Z Lian, W Wang, H Huang, C Su - IEICE Transactions on …, 2022 - search.ieice.org
In recent years, federated learning has attracted growing attention because it can
collaboratively train a global model without gathering users' raw data. It has brought …
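
A minimal sketch of the layer-based idea (the change-magnitude selection rule here is an assumption, not necessarily the paper's criterion): clients upload only the layers that changed most since the last round:

    import numpy as np

    def layers_to_upload(old, new, k=2):
        # Rank layers by how much they moved since the last round; ship top k.
        delta = {name: float(np.linalg.norm(new[name] - old[name])) for name in new}
        return sorted(delta, key=delta.get, reverse=True)[:k]

    old = {f"layer{i}": np.zeros(8) for i in range(4)}
    new = {f"layer{i}": np.random.randn(8) * (i + 1) for i in range(4)}
    upload = {name: new[name] for name in layers_to_upload(old, new)}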

Predictive GAN-powered multi-objective optimization for hybrid federated split learning

B Yin, Z Chen, M Tao - IEEE Transactions on Communications, 2023 - ieeexplore.ieee.org
As an edge intelligence algorithm for multi-device collaborative training, federated learning
(FL) can protect data privacy but increase the computing load of wireless devices. In …
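
For the split side of hybrid federated split learning, a toy forward pass (cut point, layer shapes, and activations are illustrative assumptions) shows what actually crosses the network:

    import numpy as np

    rng = np.random.default_rng(0)
    W_client = rng.standard_normal((16, 8))  # lower layers stay on the device
    W_server = rng.standard_normal((8, 1))   # upper layers live at the server

    x = rng.standard_normal((4, 16))         # a private local batch
    h = np.tanh(x @ W_client)                # client computes up to the cut layer
    y_hat = h @ W_server                     # server finishes the forward pass
    # Only h (and, in training, its gradient) crosses the network, never x,
    # which is how the split offloads compute while keeping raw data local.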

UbiNN: A communication-efficient framework for distributed machine learning in edge computing

K Li, K Chen, S Luo, H Zhang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Deploying distributed machine learning at the edge reduces latency and avoids the privacy
risks of transmitting raw data back to the cloud. Nonetheless, as …

SOAR: Minimizing network utilization with bounded in-network computing

R Segal, C Avin, G Scalosub - … of the 17th International Conference on …, 2021 - dl.acm.org
In-network computing via smart networking devices is a recent trend in modern datacenter
networks. State-of-the-art switches with near-line-rate computing and aggregation …
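
The flavor of the placement problem can be sketched with a greedy heuristic (a toy cost model, not SOAR's algorithm): given a budget of k aggregation-capable switches, enable the ones whose collapsed flows save the most upstream traffic:

    def greedy_placement(switches, k):
        # switches: (name, flows_through, hops_to_root). Enabling aggregation
        # at a switch collapses its flows to one; interactions between nested
        # placements are ignored in this toy model.
        def saving(s):
            _, flows, hops = s
            return (flows - 1) * hops
        return sorted(switches, key=saving, reverse=True)[:k]

    fabric = [("tor1", 8, 2), ("tor2", 4, 2), ("agg1", 12, 1)]
    print([name for name, _, _ in greedy_placement(fabric, k=2)])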