Moniqua: Modulo quantized communication in decentralized SGD

Y Lu, C De Sa - International Conference on Machine …, 2020 - proceedings.mlr.press
Running Stochastic Gradient Descent (SGD) in a decentralized fashion has shown
promising results. In this paper we propose Moniqua, a technique that allows decentralized …

Overlap local-SGD: An algorithmic approach to hide communication delays in distributed SGD

J Wang, H Liang, G Joshi - ICASSP 2020-2020 IEEE …, 2020 - ieeexplore.ieee.org
Distributed stochastic gradient descent (SGD) is essential for scaling machine learning
algorithms to a large number of computing nodes. However, the infrastructure's variability …

Communication-compressed adaptive gradient method for distributed nonconvex optimization

Y Wang, L Lin, J Chen - International Conference on Artificial …, 2022 - proceedings.mlr.press
Due to the explosion in the size of the training datasets, distributed learning has received
growing interest in recent years. One of the major bottlenecks is the large communication …

QSGD: Randomized quantization for communication-optimal stochastic gradient descent

D Alistarh, J Li, R Tomioka, M Vojnovic - arXiv preprint arXiv:1610.02132, 2016

Codedreduce: A fast and robust framework for gradient aggregation in distributed learning

A Reisizadeh, S Prakash, R Pedarsani… - IEEE/ACM …, 2021 - ieeexplore.ieee.org
We focus on the commonly used synchronous Gradient Descent paradigm for large-scale
distributed learning, for which there has been growing interest in developing efficient and …

Qsparse-local-SGD: Distributed SGD with quantization, sparsification and local computations

D Basu, D Data, C Karakus… - Advances in Neural …, 2019 - proceedings.neurips.cc
The communication bottleneck has been identified as a significant issue in distributed
optimization of large-scale learning models. Recently, several approaches to mitigate this …

Lazily aggregated quantized gradient innovation for communication-efficient federated learning

J Sun, T Chen, GB Giannakis, Q Yang… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
This paper focuses on the communication-efficient federated learning problem and develops a
novel distributed quantized gradient approach, which is characterized by adaptive …

Coded stochastic ADMM for decentralized consensus optimization with edge computing

H Chen, Y Ye, M Xiao, M Skoglund… - IEEE Internet of Things …, 2021 - ieeexplore.ieee.org
Big data, including data from applications with high security requirements, are often collected and
stored on multiple heterogeneous devices, such as mobile devices, drones, and vehicles …

Variance reduced local SGD with lower communication complexity

X Liang, S Shen, J Liu, Z Pan, E Chen… - arXiv preprint arXiv …, 2019 - arxiv.org
To accelerate the training of machine learning models, distributed stochastic gradient
descent (SGD) and its variants have been widely adopted, which employ multiple workers in …

Group-based alternating direction method of multipliers for distributed linear classification

H Wang, Y Gao, Y Shi, R Wang - IEEE transactions on …, 2016 - ieeexplore.ieee.org
The alternating direction method of multipliers (ADMM) has been widely employed
for distributed machine learning tasks. However, it suffers from several limitations, e.g., a …