Asynchronous decentralized SGD with quantized and local updates

G Nadiradze, A Sabour, P Davies… - Advances in Neural …, 2021 - proceedings.neurips.cc
Decentralized optimization is emerging as a viable alternative for scalable distributed
machine learning, but also introduces new challenges in terms of synchronization costs. To …
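
The title describes decentralized SGD in which nodes take local steps and exchange quantized models through pairwise gossip. The snippet below is a minimal, synchronous toy simulation of that general idea on a least-squares problem; the uniform quantizer, the random pairing, the two local steps per interaction, and the loss are illustrative assumptions, not the paper's actual protocol.

```python
import numpy as np

rng = np.random.default_rng(0)


def quantize(v, levels=16):
    """Illustrative uniform quantizer: round each coordinate to a grid
    whose spacing is set by the vector's largest magnitude."""
    scale = np.max(np.abs(v)) + 1e-12
    return np.round(v / scale * levels) / levels * scale


def local_sgd_step(x, A, b, lr=0.05):
    """One SGD step on the toy loss 0.5 * ||A x - b||^2, using a single
    randomly sampled row as the stochastic gradient."""
    i = rng.integers(len(b))
    grad = A[i] * (A[i] @ x - b[i])
    return x - lr * grad


# Toy problem shared by all nodes.
d, n_nodes = 10, 8
A = rng.normal(size=(100, d))
b = A @ rng.normal(size=d)
models = [rng.normal(size=d) for _ in range(n_nodes)]

for step in range(500):
    # Each node takes a few local steps between interactions.
    for k in range(n_nodes):
        for _ in range(2):
            models[k] = local_sgd_step(models[k], A, b)
    # A random pair of nodes exchanges quantized models and averages them,
    # mimicking one asynchronous gossip interaction.
    i, j = rng.choice(n_nodes, size=2, replace=False)
    avg = 0.5 * (quantize(models[i]) + quantize(models[j]))
    models[i] = models[j] = avg

print("final loss:", 0.5 * np.mean([(A @ m - b) @ (A @ m - b) for m in models]))
```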

Quantized distributed training of large models with convergence guarantees

I Markov, A Vladu, Q Guo… - … Conference on Machine …, 2023 - proceedings.mlr.press
Communication-reduction techniques are a popular way to improve scalability in data-
parallel training of deep neural networks (DNNs). The recent emergence of large language …
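
The snippet is cut off before any technical detail. As a small illustration of the kind of communication-reduction primitive such work builds on, here is a QSGD-style unbiased stochastic quantizer; the operator and its parameters are assumptions for illustration, not necessarily what this paper uses.

```python
import numpy as np


def stochastic_quantize(v, levels=8, rng=None):
    """QSGD-style quantizer: scale by the vector norm, then randomly round
    each coordinate to one of `levels` uniform levels so that the result
    is an unbiased estimate of the input."""
    rng = np.random.default_rng() if rng is None else rng
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return np.zeros_like(v)
    scaled = np.abs(v) / norm * levels       # values mapped into [0, levels]
    lower = np.floor(scaled)
    prob_up = scaled - lower                 # probability of rounding up
    rounded = lower + (rng.random(v.shape) < prob_up)
    return np.sign(v) * rounded / levels * norm


g = np.random.default_rng(1).normal(size=5)
print(g)
print(stochastic_quantize(g, rng=np.random.default_rng(2)))
```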

Compressed distributed gradient descent: Communication-efficient consensus over networks

X Zhang, J Liu, Z Zhu, ES Bentley - IEEE INFOCOM 2019-IEEE …, 2019 - ieeexplore.ieee.org
Network consensus optimization has received increasing attention in recent years and has
found important applications in many scientific and engineering fields. To solve network …

Gradient descent with compressed iterates

A Khaled, P Richtárik - arXiv preprint arXiv:1909.04716, 2019 - arxiv.org
We propose and analyze a new type of stochastic first order method: gradient descent with
compressed iterates (GDCI). GDCI in each iteration first compresses the current iterate using …
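
The abstract is truncated before the compression step is described. A common way to formalize gradient descent with compressed iterates is to evaluate the gradient at a compressed copy of the current iterate, x_{k+1} = x_k - γ∇f(C(x_k)); the sketch below assumes that form together with an unbiased random-sparsification operator C and a toy quadratic objective, which are illustrative choices rather than the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_sparsify(x, keep_frac=0.5):
    """Illustrative compression operator: keep a random subset of the
    coordinates and rescale them so the operator is unbiased."""
    mask = rng.random(x.shape) < keep_frac
    return x * mask / keep_frac


def gdci(grad_f, x0, lr=0.1, steps=200, keep_frac=0.5):
    """Sketch of gradient descent with compressed iterates: each step
    evaluates the gradient at a compressed copy of the iterate."""
    x = x0.copy()
    for _ in range(steps):
        x_compressed = random_sparsify(x, keep_frac)
        x = x - lr * grad_f(x_compressed)
    return x


# Toy quadratic f(x) = 0.5 * ||x - x_star||^2, so grad_f(x) = x - x_star.
x_star = rng.normal(size=10)
x_hat = gdci(lambda x: x - x_star, np.zeros(10))
print("distance to optimum:", np.linalg.norm(x_hat - x_star))
```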

Communication-censored distributed stochastic gradient descent

W Li, Z Wu, T Chen, L Li, Q Ling - IEEE Transactions on Neural …, 2021 - ieeexplore.ieee.org
This article develops a communication-efficient algorithm to solve the stochastic optimization
problem defined over a distributed network, aiming at reducing the burdensome …
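
As an illustration of the censoring idea named in the title, the sketch below shows a single communication round in which each worker transmits its gradient only when it has changed enough since its last transmission, and the aggregator otherwise reuses the stale copy. The worker/aggregator topology, fixed threshold, and norm test are simplifying assumptions; the article itself works over a distributed network.

```python
import numpy as np

rng = np.random.default_rng(0)


def censored_round(local_grads, last_sent, threshold):
    """One illustrative communication round: each worker sends its new
    gradient only when it differs enough from the last value it sent;
    otherwise the aggregator keeps the stale copy."""
    sent = 0
    for k, g in enumerate(local_grads):
        if np.linalg.norm(g - last_sent[k]) > threshold:
            last_sent[k] = g          # transmit the fresh gradient
            sent += 1
        # else: censored -- no message, last_sent[k] is reused
    aggregated = np.mean(last_sent, axis=0)
    return aggregated, sent


# Toy demo: 4 workers, 5-dimensional gradients that shrink over time.
n_workers, d = 4, 5
last_sent = np.zeros((n_workers, d))
for t in range(3):
    grads = [rng.normal(scale=1.0 / (t + 1), size=d) for _ in range(n_workers)]
    agg, sent = censored_round(grads, last_sent, threshold=0.8)
    print(f"round {t}: {sent}/{n_workers} transmitted, |aggregate| = {np.linalg.norm(agg):.3f}")
```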

Decentralized dynamic ADMM with quantized and censored communications

Y Liu, K Yuan, G Wu, Z Tian… - 2019 53rd Asilomar …, 2019 - ieeexplore.ieee.org
In this paper, we develop a quantized and communication-censored alternating direction
method of multipliers (ADMM) to solve a dynamic optimization problem defined over a …

On achieving scalability through relaxation

G Nadiradze - 2021 - research-explorer.ista.ac.at
The scalability of concurrent data structures and distributed algorithms strongly depends on
reducing the contention for shared resources and the costs of synchronization and …

Decentralized SGD with asynchronous, local and quantized updates

G Nadiradze, A Sabour, P Davies, I Markov, S Li… - 2019 - openreview.net
The ability to scale distributed optimization to large node counts has been one of the main
enablers of recent progress in machine learning. To this end, several techniques have been …

Hybrid Decentralized Optimization: First- and Zeroth-Order Optimizers Can Be Jointly Leveraged For Faster Convergence

S Talaei, G Nadiradze, D Alistarh - arXiv preprint arXiv:2210.07703, 2022 - arxiv.org
Distributed optimization has become one of the standard ways of speeding up machine
learning training, and most of the research in the area focuses on distributed first-order …
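
The title indicates that nodes with only zeroth-order (function-value) access are combined with nodes that have first-order (gradient) access. Below is a toy two-node sketch: one node estimates gradients with a two-point randomized finite-difference estimator while the other uses exact gradients, and they average after each step. The estimator, step sizes, sample count, and averaging scheme are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)


def zeroth_order_grad(f, x, mu=1e-4, samples=20):
    """Two-point zeroth-order gradient estimate: average of directional
    finite differences along random Gaussian directions."""
    est = np.zeros_like(x)
    for _ in range(samples):
        u = rng.normal(size=x.size)
        est += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return est / samples


# Toy objective f(x) = 0.5 * ||x - x_star||^2 shared by both nodes.
x_star = rng.normal(size=8)
f = lambda x: 0.5 * np.dot(x - x_star, x - x_star)
grad_f = lambda x: x - x_star

x_zo, x_fo, lr = np.zeros(8), np.zeros(8), 0.2
for _ in range(100):
    x_zo = x_zo - lr * zeroth_order_grad(f, x_zo)  # node with function values only
    x_fo = x_fo - lr * grad_f(x_fo)                # node with true gradients
    x_zo = x_fo = 0.5 * (x_zo + x_fo)              # consensus averaging step

print("error:", np.linalg.norm(x_zo - x_star))
```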

Robust communication strategy for federated learning by incorporating self-attention

Y Xu, X Li, Z Yang, HJ Song - 2020 International Conference …, 2020 - spiedigitallibrary.org
Federated learning is an emerging machine learning setting, which can train a shared
model on large amounts of decentralized data while protecting data privacy. However, the …