Distributed learning is envisioned as the bedrock of next-generation intelligent networks, where agents such as mobile devices, robots, and sensors exchange information …
Scalable training of large models (like BERT and GPT-3) requires careful optimization rooted in model design, architecture, and system capabilities. From a system standpoint …
S Chen, C Shen, L Zhang… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Communication is widely known as the primary bottleneck of federated learning, and quantization of local model updates before uploading to the parameter server is an effective …
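The quantization idea in this entry can be illustrated with a minimal sketch: before uploading, each client maps its update to a small number of discrete levels using stochastic rounding, so the quantized value is unbiased in expectation. The bit-width and the uniform/stochastic scheme below are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def quantize(update, num_bits=4, rng=None):
    """Stochastic uniform quantization of a model update (illustrative sketch).

    Returns the dequantized values the parameter server would reconstruct;
    each entry lands on one of 2**num_bits evenly spaced levels.
    """
    rng = rng or np.random.default_rng()
    levels = 2 ** num_bits - 1
    lo, hi = update.min(), update.max()
    scale = (hi - lo) / levels if hi > lo else 1.0
    # Map each entry to a fractional level index, then round stochastically
    # (down with probability 1 - frac_part, up otherwise) so that the
    # quantized value equals the original in expectation.
    frac = (update - lo) / scale
    lower = np.floor(frac)
    q = lower + (rng.random(update.shape) < (frac - lower))
    return lo + q * scale

update = np.random.default_rng(0).normal(size=1000)
q = quantize(update, num_bits=4, rng=np.random.default_rng(1))
```

With 4 bits per entry instead of 32, the uplink payload shrinks roughly 8x, at the cost of per-entry error bounded by one quantization step.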
Federated learning is a powerful distributed learning scheme that allows numerous edge devices to collaboratively train a model without sharing their data. However, training is …
X Wei, C Shen - IEEE Transactions on Cognitive …, 2022 - ieeexplore.ieee.org
Does Federated Learning (FL) work when both uplink and downlink communications have errors? How much communication noise can FL handle and what is its impact on the …
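The question this entry poses can be probed numerically with a toy round of federated averaging in which each uplink transmission is corrupted by additive Gaussian noise. The channel model, SNR parameterization, and plain averaging below are common simplifying assumptions for such experiments, not the paper's specific setup.

```python
import numpy as np

def noisy_fedavg_round(global_w, client_updates, snr_db=10.0, rng=None):
    """One FedAvg round where each client's uplink suffers additive white
    Gaussian noise, scaled so the per-client SNR matches snr_db.
    Illustrative sketch only: real analyses also model downlink errors."""
    rng = rng or np.random.default_rng()
    received = []
    for u in client_updates:
        sig_pow = np.mean(u ** 2)
        noise_pow = sig_pow / (10 ** (snr_db / 10))
        received.append(u + rng.normal(scale=np.sqrt(noise_pow), size=u.shape))
    # Server averages the noisy updates and applies them to the global model.
    return global_w + np.mean(received, axis=0)

w = np.zeros(10)
ups = [np.ones(10), 3 * np.ones(10)]
new_w = noisy_fedavg_round(w, ups, snr_db=100.0, rng=np.random.default_rng(0))
```

Sweeping `snr_db` downward in a loop like this is one simple way to see how much channel noise an FL run can tolerate before accuracy degrades.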
J Zhang, Z Qu, C Chen, H Wang, Y Zhan, B Ye… - ACM Computing …, 2021 - dl.acm.org
Machine Learning (ML) has demonstrated great promise in various fields, e.g., self-driving and smart cities, which are fundamentally altering the way individuals and organizations live, work …
C Chen, G Yao, C Wang, S Goudos, S Wan - Digital Communications and …, 2022 - Elsevier
Academic and industrial communities have been paying significant attention to the 6th Generation (6G) wireless communication systems after the commercial deployment of 5G …
We propose TopoOpt, a novel direct-connect fabric for deep neural network (DNN) training workloads. TopoOpt co-optimizes the distributed training process across three dimensions …
B Wan, J Zhao, C Wu - Proceedings of Machine Learning …, 2023 - proceedings.mlsys.org
Distributed full-graph training of Graph Neural Networks (GNNs) over large graphs is bandwidth-demanding and time-consuming. Frequent exchanges of node features …
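Why those node-feature exchanges dominate bandwidth can be seen with a simple cost model: under an edge-cut partition, each cross-partition edge forces the destination worker to pull the source node's feature vector once per layer. The function below is a hypothetical back-of-the-envelope estimator, not the paper's system.

```python
import numpy as np

def partition_comm_volume(edges, parts, feat_dim, bytes_per_elem=4):
    """Estimate per-layer uplink bytes for distributed full-graph GNN training.

    edges: iterable of (src, dst) node-id pairs.
    parts: dict mapping node id -> worker id (the graph partition).
    Each worker pulls a remote node's feature vector at most once per layer
    (duplicate requests for the same node are deduplicated).
    """
    remote_pulls = {}  # worker id -> set of remote node ids it must fetch
    for src, dst in edges:
        if parts[src] != parts[dst]:
            remote_pulls.setdefault(parts[dst], set()).add(src)
    pulled = sum(len(nodes) for nodes in remote_pulls.values())
    return pulled * feat_dim * bytes_per_elem

edges = [(0, 1), (1, 2), (2, 3), (1, 3)]
parts = {0: 0, 1: 0, 2: 1, 3: 1}
vol = partition_comm_volume(edges, parts, feat_dim=8)  # node 1 pulled once
```

Even this crude model shows why partitioning quality and feature compression both matter: volume scales with the number of distinct boundary nodes times the feature width.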