ProxSkip: Yes! Local gradient steps provably lead to communication acceleration! Finally!

K Mishchenko, G Malinovsky, S Stich… - International …, 2022 - proceedings.mlr.press
We introduce ProxSkip—a surprisingly simple and provably efficient method for minimizing
the sum of a smooth ($f$) and an expensive nonsmooth proximable ($\psi$) function. The …
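
For context, the composite problem the abstract refers to is $\min_x f(x) + \psi(x)$ with $f$ smooth and $\psi$ accessed only through its proximal operator; ProxSkip's headline idea is to evaluate that expensive prox only occasionally. The sketch below is a minimal, hedged rendering of such a randomized prox-skipping loop on this problem class: the control-variate bookkeeping follows my reading of the published algorithm, and the quadratic $f$, soft-thresholding prox, and all constants are illustrative choices, not anything taken from the paper.

    import numpy as np

    def prox_skip(grad_f, prox_psi, x0, gamma=0.1, p=0.2, iters=1000, seed=0):
        # Randomized prox-skipping loop for min_x f(x) + psi(x):
        # take a gradient step shifted by a control variate h every iteration,
        # but apply the (expensive) prox only with probability p.
        rng = np.random.default_rng(seed)
        x, h = x0.copy(), np.zeros_like(x0)
        for _ in range(iters):
            x_hat = x - gamma * (grad_f(x) - h)
            if rng.random() < p:
                x_new = prox_psi(x_hat - (gamma / p) * h, gamma / p)
                h = h + (p / gamma) * (x_new - x_hat)   # update control variate
                x = x_new
            else:
                x = x_hat                                # skip the prox this round
        return x

    # Illustrative instance (assumed, not from the paper):
    # f(x) = 0.5 * ||A x - b||^2,  psi(x) = lam * ||x||_1.
    A = np.random.randn(30, 10); b = np.random.randn(30); lam = 0.1
    grad_f = lambda x: A.T @ (A @ x - b)
    prox_psi = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t * lam, 0.0)
    x_star = prox_skip(grad_f, prox_psi, np.zeros(10),
                       gamma=1.0 / np.linalg.norm(A, 2) ** 2)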

Accelerated primal-dual gradient method for smooth and convex-concave saddle-point problems with bilinear coupling

D Kovalev, A Gasnikov… - Advances in Neural …, 2022 - proceedings.neurips.cc
In this paper we study the convex-concave saddle-point problem $\min_x \max_y f(x) + y^\top \mathbf{A} x - g(y)$, where $f(x)$ and $g(y)$ are smooth and convex functions. We …
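
As a point of reference only, here is a plain (non-accelerated) gradient descent-ascent loop on the bilinearly coupled objective $f(x) + y^\top \mathbf{A} x - g(y)$; it illustrates the problem structure, not the accelerated primal-dual method of the paper, and the quadratic choices of $f$ and $g$ are assumptions made for the example.

    import numpy as np

    def gradient_descent_ascent(grad_f, grad_g, A, x0, y0, eta=0.05, iters=2000):
        # Simultaneous gradient descent in x and ascent in y for
        # min_x max_y f(x) + y^T A x - g(y).
        x, y = x0.copy(), y0.copy()
        for _ in range(iters):
            gx = grad_f(x) + A.T @ y    # d/dx [f(x) + y^T A x]
            gy = A @ x - grad_g(y)      # d/dy [y^T A x - g(y)]
            x, y = x - eta * gx, y + eta * gy
        return x, y

    # Illustrative strongly convex-strongly concave instance (assumed):
    # f(x) = 0.5 * ||x||^2, g(y) = 0.5 * ||y||^2.
    A = np.random.randn(5, 8)
    x, y = gradient_descent_ascent(lambda x: x, lambda y: y, A,
                                   np.zeros(8), np.zeros(5))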

Sharper rates for separable minimax and finite sum optimization via primal-dual extragradient methods

Y Jin, A Sidford, K Tian - Conference on Learning Theory, 2022 - proceedings.mlr.press
We design accelerated algorithms with improved rates for several fundamental classes of
optimization problems. Our algorithms all build upon techniques related to the analysis of …

Optimal algorithms for decentralized stochastic variational inequalities

D Kovalev, A Beznosikov, A Sadiev… - Advances in …, 2022 - proceedings.neurips.cc
Variational inequalities are a formalism that includes games, minimization, saddle point, and
equilibrium problems as special cases. Methods for variational inequalities are therefore …
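
For concreteness: the variational inequality over a set $\mathcal{X}$ asks for $x^* \in \mathcal{X}$ such that $\langle F(x^*), x - x^* \rangle \ge 0$ for all $x \in \mathcal{X}$. Taking $F = \nabla f$ recovers smooth minimization, while stacking $F(x, y) = (\nabla_x \phi(x, y), -\nabla_y \phi(x, y))$ recovers the saddle-point problem $\min_x \max_y \phi(x, y)$, which is the sense in which the abstract calls variational inequalities a unifying formalism.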

Revisiting optimal convergence rate for smooth and non-convex stochastic decentralized optimization

K Yuan, X Huang, Y Chen, X Zhang… - Advances in Neural …, 2022 - proceedings.neurips.cc
While numerous effective decentralized algorithms have been proposed with theoretical
guarantees and empirical successes, the performance limits in decentralized optimization …

Lower bounds and optimal algorithms for smooth and strongly convex decentralized optimization over time-varying networks

D Kovalev, E Gasanov, A Gasnikov… - Advances in Neural …, 2021 - proceedings.neurips.cc
We consider the task of minimizing the sum of smooth and strongly convex functions stored
in a decentralized manner across the nodes of a communication network whose links are …
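
In symbols, the problem described is $\min_{x \in \mathbb{R}^d} \frac{1}{n} \sum_{i=1}^{n} f_i(x)$, where each $f_i$ is smooth and strongly convex and is held by node $i$ of a communication network whose links, and hence which nodes can exchange information, may change from round to round.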

Multi-consensus decentralized accelerated gradient descent

H Ye, L Luo, Z Zhou, T Zhang - Journal of Machine Learning Research, 2023 - jmlr.org
This paper considers the decentralized convex optimization problem, which has a wide
range of applications in large-scale machine learning, sensor networks, and control theory …

RandProx: Primal-dual optimization algorithms with randomized proximal updates

L Condat, P Richtárik - arXiv preprint arXiv:2207.12891, 2022 - arxiv.org
Proximal splitting algorithms are well suited to solving large-scale nonsmooth optimization
problems, in particular those arising in machine learning. We propose a new primal-dual …

Optimal gradient tracking for decentralized optimization

Z Song, L Shi, S Pu, M Yan - Mathematical Programming, 2024 - Springer
In this paper, we focus on solving the decentralized optimization problem of minimizing the
sum of n objective functions over a multi-agent network. The agents are embedded in an …
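
Since the title names gradient tracking, the following is a minimal sketch of the vanilla gradient-tracking recursion for minimizing $\frac{1}{n}\sum_i f_i(x)$ over a network with mixing matrix $W$; it is a textbook baseline rather than the optimal variant developed in the paper, and the ring network, quadratic local objectives, and step size are illustrative assumptions.

    import numpy as np

    def gradient_tracking(grads, W, X0, alpha=0.05, iters=500):
        # Vanilla gradient tracking: each row of X is one agent's iterate,
        # each row of Y tracks the network-average gradient.
        #   X_{k+1} = W X_k - alpha * Y_k
        #   Y_{k+1} = W Y_k + grad(X_{k+1}) - grad(X_k)
        X = X0.copy()
        G = np.stack([g(x) for g, x in zip(grads, X)])
        Y = G.copy()                       # Y_0 = local gradients
        for _ in range(iters):
            X_new = W @ X - alpha * Y
            G_new = np.stack([g(x) for g, x in zip(grads, X_new)])
            Y = W @ Y + G_new - G
            X, G = X_new, G_new
        return X

    # Illustrative setup (assumed): 4 agents on a ring, f_i(x) = 0.5*||x - c_i||^2.
    n, d = 4, 3
    W = np.array([[0.5, 0.25, 0.0, 0.25],
                  [0.25, 0.5, 0.25, 0.0],
                  [0.0, 0.25, 0.5, 0.25],
                  [0.25, 0.0, 0.25, 0.5]])   # doubly stochastic mixing matrix
    centers = np.random.randn(n, d)
    grads = [lambda x, c=c: x - c for c in centers]
    X = gradient_tracking(grads, W, np.zeros((n, d)))
    # All rows of X should approach the average of the centers.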

Communication acceleration of local gradient methods via an accelerated primal-dual algorithm with an inexact prox

A Sadiev, D Kovalev… - Advances in Neural …, 2022 - proceedings.neurips.cc
Inspired by a recent breakthrough of Mishchenko et al. [2022], who for the first time showed
that local gradient steps can lead to provable communication acceleration, we propose an …
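
To make the phrase "local gradient steps" concrete, the toy loop below has each client take gradient steps on its own $f_i$ and only synchronize (average iterates, i.e., one communication round) with probability $p$ per step; this is just the local-training pattern the abstract refers to, not the accelerated primal-dual method with inexact prox proposed in the paper, and the quadratic local objectives are an assumption made for the example.

    import numpy as np

    def local_gd_with_random_sync(grads, x0, gamma=0.1, p=0.2, iters=500, seed=0):
        # Each client i holds x_i and repeatedly takes a local gradient step;
        # with probability p the clients communicate and average their iterates.
        rng = np.random.default_rng(seed)
        X = np.stack([x0.copy() for _ in grads])
        comms = 0
        for _ in range(iters):
            X = np.stack([x - gamma * g(x) for g, x in zip(grads, X)])
            if rng.random() < p:              # occasional communication round
                X[:] = X.mean(axis=0)
                comms += 1
        return X.mean(axis=0), comms

    # Illustrative objectives (assumed): f_i(x) = 0.5 * ||x - c_i||^2.
    centers = np.random.randn(5, 3)
    grads = [lambda x, c=c: x - c for c in centers]
    x_bar, comms = local_gd_with_random_sync(grads, np.zeros(3))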