Federated composite optimization

H Yuan, M Zaheer, S Reddi - International Conference on …, 2021 - proceedings.mlr.press
Federated Learning (FL) is a distributed learning paradigm that scales on-device learning
collaboratively and privately. Standard FL algorithms such as FEDAVG are primarily geared …
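
As a rough illustration of what a composite federated objective looks like in practice, the sketch below combines FedAvg-style local SGD with a server-side proximal (soft-thresholding) step for an L1-regularized least-squares loss. The loss, regularizer, and all hyperparameters are illustrative assumptions; this is not the algorithm proposed in the cited paper.

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1 (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def fedavg_prox(client_data, rounds=50, local_steps=10, lr=0.1, lam=0.01):
    """FedAvg-style training with a server-side proximal step for an
    L1-regularized least-squares objective (illustrative sketch)."""
    dim = client_data[0][0].shape[1]
    w = np.zeros(dim)
    for _ in range(rounds):
        local_models = []
        for A, b in client_data:                      # each client: (features, targets)
            w_k = w.copy()
            for _ in range(local_steps):
                grad = A.T @ (A @ w_k - b) / len(b)   # smooth part: least squares
                w_k -= lr * grad
            local_models.append(w_k)
        w = np.mean(local_models, axis=0)             # server averaging
        w = soft_threshold(w, lr * lam)               # handle the nonsmooth L1 term
    return w

# toy usage: three clients with synthetic linear data
rng = np.random.default_rng(0)
w_true = np.array([1.0, 0.0, -2.0, 0.0])
clients = []
for _ in range(3):
    A = rng.normal(size=(40, 4))
    b = A @ w_true + 0.1 * rng.normal(size=40)
    clients.append((A, b))
print(fedavg_prox(clients))
```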

Variance reduction for stochastic gradient optimization

C Wang, X Chen, AJ Smola… - Advances in neural …, 2013 - proceedings.neurips.cc
Stochastic gradient optimization is a class of widely used algorithms for training machine
learning models. To optimize an objective, it uses the noisy gradient computed from the …
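
The cited work builds control-variate estimators for the stochastic gradient; as a generic, hedged example of variance reduction in the same spirit, here is an SVRG-style update on least squares (the snapshot-based correction is a standard construction, not the paper's estimator).

```python
import numpy as np

def svrg(A, b, epochs=20, inner=100, lr=0.05):
    """SVRG-style variance-reduced SGD on least squares (illustrative;
    the cited paper builds a different, control-variate estimator)."""
    n, d = A.shape
    w = np.zeros(d)
    rng = np.random.default_rng(0)
    grad_i = lambda w, i: A[i] * (A[i] @ w - b[i])    # per-sample gradient
    for _ in range(epochs):
        w_snap = w.copy()
        full_grad = A.T @ (A @ w_snap - b) / n        # gradient at the snapshot
        for _ in range(inner):
            i = rng.integers(n)
            # stochastic gradient corrected by the snapshot control variate
            g = grad_i(w, i) - grad_i(w_snap, i) + full_grad
            w -= lr * g
    return w

# toy usage: recover a linear model from noisy observations
rng = np.random.default_rng(1)
A = rng.normal(size=(200, 5))
b = A @ np.arange(1.0, 6.0) + 0.1 * rng.normal(size=200)
print(svrg(A, b))
```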

An asynchronous mini-batch algorithm for regularized stochastic optimization

HR Feyzmahdavian, A Aytekin… - IEEE Transactions on …, 2016 - ieeexplore.ieee.org
Mini-batch optimization has proven to be a powerful paradigm for large-scale learning.
However, the state-of-the-art parallel mini-batch algorithms assume synchronous operation …
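
A minimal way to see the effect of asynchrony is to apply a mini-batch proximal gradient step with gradients evaluated at a stale iterate, simulated serially below; the fixed delay, L1 regularizer, and constants are assumptions for illustration only.

```python
import numpy as np
from collections import deque

def delayed_prox_minibatch(A, b, lam=0.01, lr=0.05, delay=3,
                           batch=16, iters=300):
    """Mini-batch proximal gradient steps applied with a fixed gradient delay,
    serially simulating asynchronous workers (illustrative sketch)."""
    n, d = A.shape
    w = np.zeros(d)
    stale = deque([w.copy()] * delay, maxlen=delay)   # iterates seen by workers
    rng = np.random.default_rng(0)
    for _ in range(iters):
        w_old = stale[0]                              # worker computed its gradient here
        idx = rng.integers(0, n, size=batch)
        grad = A[idx].T @ (A[idx] @ w_old - b[idx]) / batch
        w = w - lr * grad
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)  # prox of the L1 term
        stale.append(w.copy())
    return w

# toy usage
rng = np.random.default_rng(1)
A = rng.normal(size=(100, 4))
b = A @ np.array([1.0, -1.0, 0.0, 2.0])
print(delayed_prox_minibatch(A, b))
```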

Fast composite optimization and statistical recovery in federated learning

Y Bao, M Crawshaw, S Luo… - … Conference on Machine …, 2022 - proceedings.mlr.press
As a prevalent distributed learning paradigm, Federated Learning (FL) trains a global model
on a massive number of devices with infrequent communication. This paper investigates a …
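
For concreteness, a toy variant of federated composite optimization with infrequent communication: each client runs local proximal SGD on an L1-regularized least-squares loss and the server averages once per round. This is only a sketch under those assumptions, not the method analyzed in the paper; the client data is the same list of (features, targets) pairs as in the FedAvg sketch above.

```python
import numpy as np

def local_prox_sgd(clients, rounds=30, local_steps=20, lr=0.05, lam=0.05):
    """Clients run proximal SGD locally on an L1-regularized least-squares
    loss and communicate only once per round (illustrative sketch)."""
    d = clients[0][0].shape[1]
    w = np.zeros(d)
    rng = np.random.default_rng(0)
    for _ in range(rounds):
        updates = []
        for A, b in clients:
            w_k = w.copy()
            for _ in range(local_steps):
                i = rng.integers(len(b))
                grad = A[i] * (A[i] @ w_k - b[i])     # single-sample gradient
                w_k -= lr * grad
                w_k = np.sign(w_k) * np.maximum(np.abs(w_k) - lr * lam, 0.0)
            updates.append(w_k)
        w = np.mean(updates, axis=0)   # infrequent communication: one average per round
    return w
```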

RSG: Beating subgradient method without smoothness and strong convexity

T Yang, Q Lin - Journal of Machine Learning Research, 2018 - jmlr.org
In this paper, we study the efficiency of a Restarted SubGradient (RSG) method that
periodically restarts the standard subgradient method (SG). We show that, when applied to a …
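
The restarting idea can be sketched as follows: run the constant-step subgradient method for a fixed number of iterations, restart from the averaged iterate, and shrink the step size between stages. The stage length and halving schedule below are illustrative choices, not the constants from the paper's analysis.

```python
import numpy as np

def restarted_subgradient(subgrad, x0, stages=8, iters_per_stage=200, step0=1.0):
    """Restart the constant-step subgradient method from the averaged iterate,
    halving the step size at each stage (sketch of the restarting idea)."""
    x = np.asarray(x0, dtype=float)
    step = step0
    for _ in range(stages):
        avg = np.zeros_like(x)
        y = x.copy()
        for _ in range(iters_per_stage):
            y = y - step * subgrad(y)
            avg += y
        x = avg / iters_per_stage   # restart from the average of the stage
        step *= 0.5
    return x

# toy usage: minimize the nonsmooth function f(x) = ||x - 1||_1
subgrad = lambda x: np.sign(x - 1.0)
print(restarted_subgradient(subgrad, np.zeros(3)))
```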

A universally optimal multistage accelerated stochastic gradient method

NS Aybat, A Fallah… - Advances in neural …, 2019 - proceedings.neurips.cc
We study the problem of minimizing a strongly convex, smooth function when we have noisy
estimates of its gradient. We propose a novel multistage accelerated algorithm that is …
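
A hedged sketch of the multistage idea: run Nesterov-style accelerated updates with noisy gradients in stages, shrinking the step size geometrically so that later stages suppress the gradient noise. The momentum value, stage lengths, and decay factor are assumptions, not the paper's tuned parameters.

```python
import numpy as np

def multistage_agd(grad_noisy, x0, stages=5, iters=200, lr0=0.1, momentum=0.9):
    """Accelerated (momentum) updates with noisy gradients, run in stages
    with a geometrically shrinking step size (illustrative sketch)."""
    x = np.asarray(x0, dtype=float)
    lr = lr0
    for _ in range(stages):
        v = np.zeros_like(x)
        for _ in range(iters):
            y = x + momentum * v               # extrapolation (look-ahead) point
            v = momentum * v - lr * grad_noisy(y)
            x = x + v
        lr *= 0.5                              # smaller steps suppress gradient noise
    return x

# toy usage: strongly convex quadratic with additive gradient noise
rng = np.random.default_rng(0)
grad_noisy = lambda x: 2.0 * (x - 3.0) + 0.5 * rng.normal(size=x.shape)
print(multistage_agd(grad_noisy, np.zeros(4)))
```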

A simpler approach to accelerated optimization: iterative averaging meets optimism

P Joulani, A Raj, A Gyorgy… - … conference on machine …, 2020 - proceedings.mlr.press
Recently there have been several attempts to extend Nesterov's accelerated algorithm to
smooth stochastic and variance-reduced optimization. In this paper, we show that there is a …
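
A simplified sketch of the averaging-plus-optimism template: take optimistic gradient steps (correcting each step by the previous gradient used as a hint) while querying gradients at the running average of the iterates and returning that average. This is an illustrative instance of the idea, not the exact algorithm or analysis of the paper.

```python
import numpy as np

def optimistic_averaged_gd(grad, x0, iters=500, lr=0.05):
    """Online gradient steps with an optimistic hint (the previous gradient),
    with each gradient queried at the running average of the iterates."""
    x = np.asarray(x0, dtype=float)
    avg = x.copy()
    hint = np.zeros_like(x)
    for t in range(1, iters + 1):
        g = grad(avg)                      # query the gradient at the average
        x = x - lr * (g + (g - hint))      # optimistic step: correct by the hint error
        hint = g
        avg = avg + (x - avg) / (t + 1)    # running (uniform) iterate average
    return avg

# toy usage: smooth convex quadratic
print(optimistic_averaged_gd(lambda z: 2.0 * (z - 1.0), np.zeros(3)))
```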

Towards an optimal stochastic alternating direction method of multipliers

S Azadi, S Sra - International Conference on Machine …, 2014 - proceedings.mlr.press
We study regularized stochastic convex optimization subject to linear equality constraints.
This class of problems was recently also studied by Ouyang et al. (2013) and Suzuki (2013); …
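
A small stochastic ADMM sketch for a regularized problem with a linear equality constraint, using the splitting x - z = 0 with an L1 regularizer: the x-step linearizes the loss with a single-sample gradient, the z-step is soft-thresholding, and the multiplier is updated by dual ascent. The splitting, penalty, and step sizes are illustrative assumptions.

```python
import numpy as np

def stochastic_admm(A, b, lam=0.1, rho=1.0, eta=0.05, iters=500):
    """Stochastic ADMM for min_x avg_i 0.5*(a_i^T x - b_i)^2 + lam*||z||_1
    subject to x - z = 0 (illustrative sketch)."""
    n, d = A.shape
    x = np.zeros(d)
    z = np.zeros(d)
    u = np.zeros(d)                                    # scaled dual variable
    rng = np.random.default_rng(0)
    for _ in range(iters):
        i = rng.integers(n)
        g = A[i] * (A[i] @ x - b[i])                   # stochastic gradient of the loss
        x = x - eta * (g + rho * (x - z + u))          # linearized x-update
        z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0.0)
        u = u + x - z                                  # dual (multiplier) update
    return z

# toy usage
rng = np.random.default_rng(3)
A = rng.normal(size=(200, 5))
b = A @ np.array([1.0, 0.0, -2.0, 0.0, 0.5])
print(stochastic_admm(A, b))
```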

On the convergence of mirror descent beyond stochastic convex programming

Z Zhou, P Mertikopoulos, N Bambos, SP Boyd… - SIAM Journal on …, 2020 - SIAM
In this paper, we examine the convergence of mirror descent in a class of stochastic
optimization problems that are not necessarily convex (or even quasi-convex) and which we …
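
For reference, the basic mirror descent update with the entropic mirror map on the probability simplex (exponentiated gradient); the paper's stochastic and nonconvex analysis is not reproduced here, only the deterministic update.

```python
import numpy as np

def entropic_mirror_descent(grad, dim, iters=300, lr=0.1):
    """Mirror descent with the entropic mirror map on the probability simplex
    (exponentiated gradient update)."""
    x = np.full(dim, 1.0 / dim)               # start at the uniform distribution
    for _ in range(iters):
        x = x * np.exp(-lr * grad(x))         # multiplicative (mirror) step
        x = x / x.sum()                       # Bregman projection back to the simplex
    return x

# toy usage: minimize a linear cost <c, x> over the simplex
c = np.array([3.0, 1.0, 2.0])
print(entropic_mirror_descent(lambda x: c, 3))
```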

The strength of Nesterov's extrapolation in the individual convergence of nonsmooth optimization

W Tao, Z Pan, G Wu, Q Tao - IEEE Transactions on Neural …, 2019 - ieeexplore.ieee.org
The extrapolation strategy introduced by Nesterov, which can accelerate the convergence rate of
gradient descent methods by orders of magnitude when dealing with smooth convex …
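
A minimal sketch of Nesterov-style extrapolation applied to a subgradient method: each step is taken from the look-ahead point y_t = x_t + beta_t (x_t - x_{t-1}), and the last (individual) iterate is returned. The beta_t and step-size schedules below are illustrative, not the ones analyzed in the paper.

```python
import numpy as np

def extrapolated_subgradient(subgrad, x0, iters=2000, c=0.1):
    """Subgradient steps taken from a Nesterov-style extrapolation point,
    returning the individual (last) iterate (illustrative sketch)."""
    x_prev = np.asarray(x0, dtype=float)
    x = x_prev.copy()
    for t in range(1, iters + 1):
        beta = (t - 1) / (t + 2)                # Nesterov-style momentum weight
        y = x + beta * (x - x_prev)             # extrapolation (look-ahead) point
        x_prev = x
        x = y - (c / np.sqrt(t)) * subgrad(y)   # diminishing-step subgradient move
    return x

# toy usage: nonsmooth objective f(x) = ||x||_1 + 0.5*||x - 1||^2
subgrad = lambda x: np.sign(x) + (x - 1.0)
print(extrapolated_subgradient(subgrad, np.zeros(3)))
```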