A linearly convergent algorithm for decentralized optimization: Sending less bits for free!

D Kovalev, A Koloskova, M Jaggi… - International …, 2021 - proceedings.mlr.press
Decentralized optimization methods enable on-device training of machine learning models
without a central coordinator. In many scenarios, communication between devices is energy …

A DAG model of synchronous stochastic gradient descent in distributed deep learning

S Shi, Q Wang, X Chu, B Li - 2018 IEEE 24th International …, 2018 - ieeexplore.ieee.org
With huge amounts of training data, deep learning has made great breakthroughs in many
artificial intelligence (AI) applications. However, such large-scale data sets present …

Matcha: A matching-based link scheduling strategy to speed up distributed optimization

J Wang, AK Sahu, G Joshi, S Kar - IEEE Transactions on Signal …, 2022 - ieeexplore.ieee.org
In this paper, we study the problem of distributed optimization using an arbitrary network of
lightweight computing nodes, where each node can only send/receive information to/from its …

Minimizing training time of distributed machine learning by reducing data communication

Y Duan, N Wang, J Wu - IEEE Transactions on Network …, 2021 - ieeexplore.ieee.org
Due to the additive property of most machine learning objective functions, training can
be distributed across multiple machines. Distributed machine learning is an efficient way to deal …

Decentralized SGD and average-direction SAM are asymptotically equivalent

T Zhu, F He, K Chen, M Song… - … Conference on Machine …, 2023 - proceedings.mlr.press
Decentralized stochastic gradient descent (D-SGD) allows collaborative learning across a
massive number of devices simultaneously without the control of a central server. However, existing …
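
For orientation, the D-SGD update this entry refers to can be written in a few lines: each node takes a local stochastic gradient step and then gossip-averages its parameters with its neighbors through a mixing matrix. The Python/NumPy sketch below is a generic illustration under assumed ingredients (a fixed doubly stochastic ring matrix W, toy quadratic losses, a hypothetical dsgd_round helper); it does not reproduce the cited paper's method or its equivalence analysis.

import numpy as np

def dsgd_round(params, grads, W, lr):
    # params: (n_nodes, dim) per-node models; grads: matching stochastic gradients;
    # W: doubly stochastic mixing matrix; lr: step size.
    # Generic D-SGD step for illustration, not the cited paper's algorithm.
    local = params - lr * grads   # local SGD step on every node
    return W @ local              # gossip averaging with neighbors

# Toy usage: 4 nodes on a ring, local loss 0.5*||x||^2 so the gradient is x itself.
W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])
x = np.random.default_rng(0).normal(size=(4, 3))
for _ in range(10):
    x = dsgd_round(x, grads=x, W=W, lr=0.1)
print(x)  # node models shrink toward a common consensus point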

Local AdaAlter: Communication-efficient stochastic gradient descent with adaptive learning rates

C Xie, O Koyejo, I Gupta, H Lin - arXiv preprint arXiv:1911.09030, 2019 - arxiv.org
When scaling distributed training, the communication overhead is often the bottleneck. In
this paper, we propose a novel SGD variant with reduced communication and adaptive …

Dynamic aggregation for heterogeneous quantization in federated learning

S Chen, C Shen, L Zhang… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Communication is widely known as the primary bottleneck of federated learning, and
quantization of local model updates before uploading to the parameter server is an effective …
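
As a reference point for the quantization step mentioned in this snippet, the Python/NumPy sketch below shows a standard unbiased stochastic uniform quantizer applied to a local model update before upload. The quantize function and the num_levels parameter are illustrative choices; this is a generic compressor, not the paper's dynamic aggregation scheme for heterogeneous quantization.

import numpy as np

def quantize(update, num_levels=4, rng=None):
    # Unbiased stochastic uniform quantization (QSGD-style), shown for illustration only.
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(update)
    if norm == 0.0:
        return np.zeros_like(update)
    scaled = np.abs(update) / norm * num_levels   # coordinates mapped into [0, num_levels]
    lower = np.floor(scaled)
    levels = lower + (rng.random(update.shape) < scaled - lower)  # stochastic rounding up
    return np.sign(update) * levels * norm / num_levels

# Toy usage: averaging many quantized copies recovers the update (unbiasedness).
u = np.random.default_rng(1).normal(size=5)
avg = np.mean([quantize(u, rng=np.random.default_rng(s)) for s in range(2000)], axis=0)
print(u)
print(avg)  # close to u on average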

Stochastic distributed learning with gradient quantization and double-variance reduction

S Horváth, D Kovalev, K Mishchenko… - Optimization Methods …, 2023 - Taylor & Francis
We consider distributed optimization over several devices, each sending incremental model
updates to a central server. This setting is considered, for instance, in federated learning …

Multi-tier federated learning for vertically partitioned data

A Das, S Patterson - ICASSP 2021-2021 IEEE International …, 2021 - ieeexplore.ieee.org
We consider decentralized model training in tiered communication networks. Our network
model consists of a set of silos, each holding a vertical partition of the data. Each silo …

Decentralized federated learning with unreliable communications

H Ye, L Liang, GY Li - IEEE Journal of Selected Topics in Signal …, 2022 - ieeexplore.ieee.org
Decentralized federated learning, inherited from decentralized learning, enables the edge
devices to collaborate on model training in a peer-to-peer manner without the assistance of …