Recent advances in stochastic gradient descent in deep learning

Y Tian, Y Zhang, H Zhang - Mathematics, 2023 - mdpi.com
In the age of artificial intelligence, finding the best approach to handling huge amounts of data is a
tremendously motivating and hard problem. Among machine learning models, stochastic …

Federated learning with buffered asynchronous aggregation

J Nguyen, K Malik, H Zhan… - International …, 2022 - proceedings.mlr.press
Scalability and privacy are two critical concerns for cross-device federated learning (FL)
systems. In this work, we identify that synchronous FL cannot scale efficiently beyond a few …

Federated learning with non-iid data

Y Zhao, M Li, L Lai, N Suda, D Civin… - arXiv preprint arXiv …, 2018 - arxiv.org
Federated learning enables resource-constrained edge compute devices, such as mobile
phones and IoT devices, to learn a shared model for prediction, while keeping the training …

Adaptive methods for nonconvex optimization

M Zaheer, S Reddi, D Sachan… - Advances in neural …, 2018 - proceedings.neurips.cc
Adaptive gradient methods that rely on scaling gradients down by the square root of
exponential moving averages of past squared gradients, such as RMSProp, Adam, Adadelta …
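
The scaling rule this snippet refers to can be sketched in a few lines. The following is a minimal NumPy illustration of an RMSProp-style step, not code from the paper; the names `v`, `beta`, and `eps` are illustrative choices.

```python
import numpy as np

def rmsprop_step(w, grad, v, lr=1e-3, beta=0.9, eps=1e-8):
    """One RMSProp-style update: scale the gradient down by the square root
    of an exponential moving average (EMA) of past squared gradients."""
    v = beta * v + (1 - beta) * grad ** 2       # EMA of squared gradients
    w = w - lr * grad / (np.sqrt(v) + eps)      # gradient step with per-coordinate scaling
    return w, v
```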

Federated optimization: Distributed machine learning for on-device intelligence

J Konečný, HB McMahan, D Ramage… - arXiv preprint arXiv …, 2016 - arxiv.org
We introduce a new and increasingly relevant setting for distributed optimization in machine
learning, where the data defining the optimization are unevenly distributed over an …

Gradient sparsification for communication-efficient distributed optimization

J Wangni, J Wang, J Liu… - Advances in Neural …, 2018 - proceedings.neurips.cc
Modern large-scale machine learning applications require stochastic optimization
algorithms to be implemented on distributed computational architectures. A key bottleneck is …

Revisiting distributed synchronous SGD

J Chen, X Pan, R Monga, S Bengio… - arXiv preprint arXiv …, 2016 - arxiv.org
Distributed training of deep learning models on large-scale training data is typically
conducted with asynchronous stochastic optimization to maximize the rate of updates, at the …

Stochastic variance reduction for nonconvex optimization

SJ Reddi, A Hefny, S Sra, B Poczos… - … on machine learning, 2016 - proceedings.mlr.press
We study nonconvex finite-sum problems and analyze stochastic variance reduced gradient
(SVRG) methods for them. SVRG and related methods have recently surged into …

Federated variance-reduced stochastic gradient descent with robustness to byzantine attacks

Z Wu, Q Ling, T Chen… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
This paper deals with distributed finite-sum optimization for learning over multiple workers in
the presence of malicious Byzantine attacks. Most resilient approaches so far combine …

Proximal stochastic methods for nonsmooth nonconvex finite-sum optimization

SJ Reddi, S Sra, B Poczos… - Advances in neural …, 2016 - proceedings.neurips.cc
We analyze stochastic algorithms for optimizing nonconvex, nonsmooth finite-sum problems,
where the nonsmooth part is convex. Surprisingly, unlike the smooth case, our knowledge of …