F Zeng, W Gan, Y Wang… - 2023 IEEE 29th …, 2023 - ieeexplore.ieee.org
The advent of large language models (LLMs), such as ChatGPT, ushers in revolutionary opportunities across a vast variety of applications (such as healthcare, law, and …
Graph neural networks (GNNs) have emerged thanks to their success at modeling graph data. Yet, it is challenging for GNNs to scale efficiently to large graphs. Thus, distributed GNNs …
In a vertical federated learning (VFL) scenario, where features and models are split across different parties, it has been shown that sample-level gradient information can be exploited …
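To see why sample-level gradients are sensitive in such splits, consider a hedged toy example (illustrative only, not the specific attack studied in this work): with binary cross-entropy, the per-sample gradient with respect to the logit is sigmoid(logit) - label, so even the sign of the communicated gradient discloses the label. A minimal NumPy sketch:

    import numpy as np

    # Hypothetical illustration: in vertical/split setups, the label-holding
    # party often sends per-sample gradients w.r.t. the logits back to the
    # feature-holding party. For binary cross-entropy that gradient is
    # sigmoid(logit) - label, so its sign alone reveals each label.
    rng = np.random.default_rng(0)
    logits = rng.normal(size=8)            # feature party's partial predictions
    labels = rng.integers(0, 2, size=8)    # held only by the label party

    probs = 1.0 / (1.0 + np.exp(-logits))
    grad_wrt_logits = probs - labels       # what gets communicated in training

    inferred = (grad_wrt_logits < 0).astype(int)  # negative gradient => label 1
    assert (inferred == labels).all()
    print("labels recovered from gradient signs:", inferred)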
The current trend in deep learning is to scale models to extremely large sizes with the objective of increasing their accuracy. Mixture-of-Experts (MoE) is the most popular pre …
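As a hedged illustration of the MoE idea (generic top-k token routing; the names and shapes here are invented for the sketch, not this paper's system): a router scores all experts per token, but only the top-k experts execute, so parameter count grows with the number of experts while per-token compute stays roughly constant.

    import numpy as np

    def moe_layer(x, gate_w, experts, k=2):
        """Toy Mixture-of-Experts: route each token to its top-k experts.

        x:       (tokens, d) inputs
        gate_w:  (d, num_experts) router weights
        experts: list of callables, each mapping (d,) -> (d,)
        """
        scores = x @ gate_w                            # (tokens, num_experts)
        topk = np.argsort(scores, axis=1)[:, -k:]      # indices of top-k experts
        out = np.zeros_like(x)
        for t in range(x.shape[0]):
            sel = topk[t]
            w = np.exp(scores[t, sel])
            w /= w.sum()                               # softmax over selected experts
            out[t] = sum(wi * experts[e](x[t]) for wi, e in zip(w, sel))
        return out

    d, n_exp = 4, 8
    rng = np.random.default_rng(1)
    experts = [lambda v, W=rng.normal(size=(d, d)): v @ W for _ in range(n_exp)]
    y = moe_layer(rng.normal(size=(10, d)), rng.normal(size=(d, n_exp)), experts)
    print(y.shape)  # (10, 4): each token processed by only 2 of 8 experts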
Y Liao, Y Xu, H Xu, L Wang… - IEEE INFOCOM 2023-IEEE …, 2023 - ieeexplore.ieee.org
Data generated at the network edge can be processed locally by leveraging the paradigm of edge computing (EC). Aided by EC, decentralized federated learning (DFL), which …
The scale of modern deep learning calls for efficient distributed training algorithms. Decentralized momentum SGD (DmSGD), in which each node averages only with its …
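A hedged sketch of one common DmSGD-style template (assuming the truncated text refers to averaging with neighbors via a doubly stochastic mixing matrix W; the paper's exact update may differ): each node takes a local momentum SGD step, then averages its parameters only with its neighbors.

    import numpy as np

    def dmsgd_step(params, moms, grads, W, lr=0.1, beta=0.9):
        """One round of a generic decentralized momentum SGD variant.

        params: (n_nodes, d) per-node parameters
        moms:   (n_nodes, d) per-node momentum buffers
        grads:  (n_nodes, d) local stochastic gradients
        W:      (n_nodes, n_nodes) doubly stochastic mixing matrix;
                W[i, j] > 0 only if j is a neighbor of i
        """
        moms = beta * moms + grads          # local momentum update
        local = params - lr * moms          # local SGD step
        params = W @ local                  # average only with neighbors
        return params, moms

    # Ring topology over 4 nodes: each node mixes with itself and 2 neighbors.
    W = np.array([[0.50, 0.25, 0.00, 0.25],
                  [0.25, 0.50, 0.25, 0.00],
                  [0.00, 0.25, 0.50, 0.25],
                  [0.25, 0.00, 0.25, 0.50]])
    rng = np.random.default_rng(2)
    params, moms = rng.normal(size=(4, 3)), np.zeros((4, 3))
    params, moms = dmsgd_step(params, moms, rng.normal(size=(4, 3)), W)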
X Deng, T Sun, S Li, D Li - Proceedings of the AAAI Conference on …, 2023 - ojs.aaai.org
The generalization ability often determines the success of machine learning algorithms in practice. Therefore, it is of great theoretical and practical importance to understand and …
M Even, A Koloskova… - … Conference on Artificial …, 2024 - proceedings.mlr.press
Decentralized and asynchronous communication are two popular techniques for reducing the communication complexity of distributed machine learning, by respectively removing the …
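For contrast with the synchronous neighbor averaging above, a hedged model of the asynchronous side: randomized pairwise gossip, a standard abstraction in this literature (assumed here for illustration, not necessarily this paper's protocol). At each tick a single random edge activates and its two endpoints average their values, with no global barrier.

    import numpy as np

    def async_gossip(values, edges, n_ticks, seed=0):
        """Asynchronous randomized pairwise gossip (illustrative model).

        At every tick one random edge (i, j) activates and the two
        endpoints average their current values; no other node waits.
        """
        rng = np.random.default_rng(seed)
        values = values.astype(float)
        for _ in range(n_ticks):
            i, j = edges[rng.integers(len(edges))]
            avg = 0.5 * (values[i] + values[j])
            values[i] = values[j] = avg
        return values

    ring = [(0, 1), (1, 2), (2, 3), (3, 0)]      # 4-node ring
    x = np.array([0.0, 4.0, 8.0, 12.0])
    print(async_gossip(x, ring, n_ticks=200))    # converges toward the mean, 6.0

Pairwise averaging preserves the sum of the values, so the iterates converge to the global mean without any node ever synchronizing with more than one peer at a time.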
All-reduce is the key communication primitive used in distributed data-parallel training due to its high performance in homogeneous environments. However, All-reduce is sensitive …
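To ground the primitive, a hedged single-process simulation of ring all-reduce, the bandwidth-optimal variant common in data-parallel training (a sketch, not a real communication backend): each node's gradient is split into n segments, and after n-1 reduce-scatter steps plus n-1 all-gather steps every node holds the full sum.

    import numpy as np

    def ring_allreduce(chunks):
        """Simulated ring all-reduce over n nodes.

        chunks[i] is node i's gradient, pre-split into n equal segments.
        Returns the buffers after the reduction: every node ends up
        holding the sum of all nodes' gradients.
        """
        n = len(chunks)
        buf = [[seg.copy() for seg in node] for node in chunks]
        # Reduce-scatter: after n-1 steps, node i owns the full sum
        # of segment (i + 1) % n.
        for step in range(n - 1):
            for i in range(n):
                seg = (i - step) % n
                buf[(i + 1) % n][seg] += buf[i][seg]
        # All-gather: circulate each completed segment around the ring.
        for step in range(n - 1):
            for i in range(n):
                seg = (i + 1 - step) % n
                buf[(i + 1) % n][seg] = buf[i][seg].copy()
        return buf

    n, d = 4, 2
    rng = np.random.default_rng(3)
    grads = [[rng.normal(size=d) for _ in range(n)] for _ in range(n)]
    out = ring_allreduce(grads)
    expected = [sum(grads[i][s] for i in range(n)) for s in range(n)]
    assert all(np.allclose(out[i][s], expected[s])
               for i in range(n) for s in range(n))

Each node sends exactly one segment per step, so the per-node traffic is about 2(n-1)/n times the gradient size regardless of n; this balance is what makes the primitive fast in homogeneous clusters and, conversely, sensitive to stragglers and heterogeneous links.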