Combining federated learning and edge computing toward ubiquitous intelligence in 6G network: Challenges, recent advances, and future directions

Q Duan, J Huang, S Hu, R Deng… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org
Full leverage of the huge volume of data generated on a large number of user devices for
providing intelligent services in the 6G network calls for Ubiquitous Intelligence (UI). A key to …

Offloading machine learning to programmable data planes: A systematic survey

R Parizotto, BL Coelho, DC Nunes, I Haque… - ACM Computing …, 2023 - dl.acm.org
The demand for machine learning (ML) has increased significantly in recent decades,
enabling several applications, such as speech recognition, computer vision, and …

In-network aggregation for data center networks: A survey

A Feng, D Dong, F Lei, J Ma, E Yu, R Wang - Computer Communications, 2023 - Elsevier
Aggregation applications are widely deployed in data centers, such as distributed machine
learning and MapReduce-like frameworks. These applications typically have large …

Accelerating Distributed Training With Collaborative In-Network Aggregation

J Fang, H Xu, G Zhao, Z Yu, B Shen… - IEEE/ACM Transactions …, 2024 - ieeexplore.ieee.org
The surging scale of distributed training (DT) incurs significant communication overhead in
datacenters, while a promising solution is in-network aggregation (INA). It leverages …

GOAT: Gradient scheduling with collaborative in-network aggregation for distributed training

J Fang, G Zhao, H Xu, Z Yu, B Shen… - 2023 IEEE/ACM 31st …, 2023 - ieeexplore.ieee.org
The surging scale of distributed training (DT) incurs significant communication overhead in
datacenters, while a promising solution is in-network aggregation (INA). It leverages …

Straggler-Aware Gradient Aggregation for Large-Scale Distributed Deep Learning System

Y Li, J Huang, Z Li, J Liu, S Zhou… - IEEE/ACM …, 2024 - ieeexplore.ieee.org
Deep Neural Network (DNN) is a critical component of a wide range of applications.
However, with the rapid growth of the training dataset and model size, communication …

Releasing the Power of In-Network Aggregation With Aggregator-Aware Routing Optimization

S Luo, X Yu, K Li, H Xing - IEEE/ACM Transactions on …, 2024 - ieeexplore.ieee.org
By offloading part of the aggregation computation from the logical central parameter
servers to network devices like programmable switches, In-Network Aggregation (INA) is a …

Host-driven In-Network Aggregation on RDMA

Y Li, W Li, Y Yao, Y Du, K Li - IEEE INFOCOM 2024-IEEE …, 2024 - ieeexplore.ieee.org
Large-scale datacenter networks are increasingly using in-network aggregation (INA) and
remote direct memory access (RDMA) techniques to accelerate deep neural network (DNN) …

Straggler-Aware In-Network Aggregation for Accelerating Distributed Deep Learning

H Lee, J Lee, H Kim, S Pack - IEEE Transactions on Services …, 2023 - ieeexplore.ieee.org
In-network aggregation facilitates accelerated distributed deep learning by utilizing a
programmable switch to aggregate gradient packets. However, a straggler problem should …

No Worker Left (Too Far) Behind: Dynamic Hybrid Synchronization for In‐Network ML Aggregation

D Cardoso Nunes, B Loureiro Coelho… - … Journal of Network …, 2024 - Wiley Online Library
Achieving high‐performance aggregation is essential to scaling data‐parallel distributed
machine learning (ML) training. Recent research in in‐network computing has shown that …