Variance-reduced methods for machine learning

RM Gower, M Schmidt, F Bach… - Proceedings of the …, 2020 - ieeexplore.ieee.org
Stochastic optimization lies at the heart of machine learning, and its cornerstone is
stochastic gradient descent (SGD), a method introduced over 60 years ago. The last eight …
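
The snippet is truncated, but the survey's subject, variance reduction for finite sums, can be illustrated with the classic SVRG-style control-variate estimator. A minimal sketch on a toy least-squares problem (the quadratic objective, step size, and epoch count are illustrative assumptions, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 5
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

def grad_i(x, i):
    # Gradient of the i-th term f_i(x) = 0.5 * (a_i^T x - b_i)^2
    return A[i] * (A[i] @ x - b[i])

def full_grad(x):
    return A.T @ (A @ x - b) / n

x = np.zeros(d)
step = 0.01
for epoch in range(30):
    w = x.copy()                # snapshot point
    gw = full_grad(w)           # full gradient at the snapshot
    for _ in range(n):
        i = rng.integers(n)
        # SVRG estimator: unbiased, with variance vanishing as x approaches w
        g = grad_i(x, i) - grad_i(w, i) + gw
        x -= step * g
print("grad norm:", np.linalg.norm(full_grad(x)))
```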

Almost sure convergence rates for stochastic gradient descent and stochastic heavy ball

O Sebbouh, RM Gower… - Conference on Learning …, 2021 - proceedings.mlr.press
We study stochastic gradient descent (SGD) and the stochastic heavy ball method (SHB,
otherwise known as the momentum method) for the general stochastic approximation …
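
As a reference point for the update being analyzed, here is a minimal stochastic heavy ball (momentum) iteration on a toy least-squares problem; the objective, step size, and momentum value are illustrative assumptions, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 10
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

x = np.zeros(d)
x_prev = x.copy()
gamma, beta = 0.005, 0.9        # step size and momentum (illustrative values)
for k in range(2000):
    i = rng.integers(n)
    g = A[i] * (A[i] @ x - b[i])    # stochastic gradient of one term
    # SHB: gradient step plus a momentum term built from the previous iterate
    x, x_prev = x - gamma * g + beta * (x - x_prev), x
print("objective:", 0.5 * np.mean((A @ x - b) ** 2))
```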

SGD for structured nonconvex functions: Learning rates, minibatching and interpolation

R Gower, O Sebbouh, N Loizou - … Conference on Artificial …, 2021 - proceedings.mlr.press
Stochastic Gradient Descent (SGD) is being used routinely for optimizing non-
convex functions. Yet, the standard convergence theory for SGD in the smooth non-convex …
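
The snippet mentions learning rates, minibatching, and interpolation; a minimal minibatch SGD sketch under exact interpolation, where a constant step size suffices, looks like the following (batch size, step size, and the interpolating least-squares model are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, batch = 200, 10, 16
A = rng.normal(size=(n, d))
x_star = rng.normal(size=d)
b = A @ x_star                  # interpolation: a model fitting all data exactly

x = np.zeros(d)
step = 0.1                      # constant step size, admissible under interpolation
for k in range(500):
    idx = rng.choice(n, size=batch, replace=False)
    # Minibatch gradient: average of the sampled per-example gradients
    g = A[idx].T @ (A[idx] @ x - b[idx]) / batch
    x -= step * g
print("distance to x_star:", np.linalg.norm(x - x_star))
```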

Stochastic Hamiltonian gradient methods for smooth games

N Loizou, H Berard… - International …, 2020 - proceedings.mlr.press
The success of adversarial formulations in machine learning has brought renewed
motivation for smooth games. In this work, we focus on the class of stochastic Hamiltonian …
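
Hamiltonian gradient methods descend on H(z) = 0.5 * ||v(z)||^2, where v is the game's simultaneous-gradient vector field. A minimal deterministic sketch on a bilinear game min_x max_y x^T A y (the game, step size, and iteration count are illustrative; the paper's methods subsample this gradient stochastically):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
A = rng.normal(size=(d, d))

x = rng.normal(size=d)
y = rng.normal(size=d)
step = 0.05
for k in range(500):
    # Simultaneous gradient of min_x max_y x^T A y: v = (A y, -A^T x).
    # Hamiltonian H = 0.5 * (||A y||^2 + ||A^T x||^2); its gradient is:
    gx = A @ A.T @ x
    gy = A.T @ A @ y
    x -= step * gx
    y -= step * gy
print("||v||:", np.hypot(np.linalg.norm(A @ y), np.linalg.norm(A.T @ x)))
```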

Single-call stochastic extragradient methods for structured non-monotone variational inequalities: Improved analysis under weaker conditions

S Choudhury, E Gorbunov… - Advances in Neural …, 2024 - proceedings.neurips.cc
Single-call stochastic extragradient methods, like stochastic past extragradient (SPEG) and
stochastic optimistic gradient (SOG), have gained a lot of interest in recent years and are …
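
Single-call here means one operator evaluation per iteration: the past extragradient reuses the previous evaluation in place of a second fresh one. A minimal deterministic sketch on a monotone bilinear problem (operator, step size, and iteration count are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
A = rng.normal(size=(d, d))

def F(z):
    # Operator of the bilinear game min_x max_y x^T A y, with z = (x, y)
    x, y = z[:d], z[d:]
    return np.concatenate([A @ y, -A.T @ x])

z = rng.normal(size=2 * d)
gamma = 0.05
F_prev = F(z)                   # one extra evaluation before the loop
for k in range(2000):
    z_half = z - gamma * F_prev     # extrapolate using the PAST evaluation
    F_half = F(z_half)              # the single operator call this iteration
    z = z - gamma * F_half
    F_prev = F_half
print("||F(z)||:", np.linalg.norm(F(z)))
```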

Unified analysis of stochastic gradient methods for composite convex and smooth optimization

A Khaled, O Sebbouh, N Loizou, RM Gower… - Journal of Optimization …, 2023 - Springer
We present a unified theorem for the convergence analysis of stochastic gradient algorithms
for minimizing a smooth and convex loss plus a convex regularizer. We do this by extending …
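
The composite setting is min_x f(x) + R(x) with f smooth and R a convex regularizer handled through its proximal operator. A minimal proximal SGD sketch with an L1 regularizer, whose prox is soft-thresholding (the problem and constants are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 100, 20, 0.1
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

def prox_l1(v, t):
    # Proximal operator of t * lam * ||.||_1 (soft-thresholding)
    return np.sign(v) * np.maximum(np.abs(v) - t * lam, 0.0)

x = np.zeros(d)
step = 0.005
for k in range(5000):
    i = rng.integers(n)
    g = A[i] * (A[i] @ x - b[i])     # stochastic gradient of the smooth part
    x = prox_l1(x - step * g, step)  # gradient step, then prox of the regularizer
print("nonzeros:", np.count_nonzero(x))
```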

An optimal algorithm for decentralized finite-sum optimization

H Hendrikx, F Bach, L Massoulie - SIAM Journal on Optimization, 2021 - SIAM
Modern large-scale finite-sum optimization relies on two key aspects: distribution and
stochastic updates. For smooth and strongly convex problems, existing decentralized …
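
Decentralized methods interleave local stochastic updates with gossip averaging over a mixing matrix W. A minimal decentralized gradient sketch on a ring of agents (topology, mixing weights, and the quadratic local losses are illustrative assumptions, not the paper's optimal algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)
m, d = 8, 5                          # agents, dimension
A = rng.normal(size=(m, 10, d))      # each agent holds 10 local examples
b = rng.normal(size=(m, 10))

# Doubly stochastic mixing matrix for a ring: average with both neighbors
W = np.zeros((m, m))
for i in range(m):
    W[i, i] = 0.5
    W[i, (i - 1) % m] = 0.25
    W[i, (i + 1) % m] = 0.25

X = np.zeros((m, d))                 # one row of parameters per agent
step = 0.02
for k in range(1000):
    G = np.stack([Ai.T @ (Ai @ x - bi) / len(bi)
                  for Ai, bi, x in zip(A, b, X)])
    X = W @ (X - step * G)           # local gradient step, then gossip averaging
print("consensus gap:", np.linalg.norm(X - X.mean(axis=0)))
```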

Faster federated optimization under second-order similarity

A Khaled, C Jin - arXiv preprint arXiv:2209.02257, 2022 - arxiv.org
Federated learning (FL) is a subfield of machine learning where multiple clients try to
collaboratively learn a model over a network under communication constraints. We consider …
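
A core communication-saving primitive in FL is local updating: each client runs several local gradient steps, then the server averages the results. A minimal FedAvg-style round is sketched below; the client losses, counts, and step sizes are illustrative, and the paper itself studies algorithms exploiting second-order similarity rather than plain averaging:

```python
import numpy as np

rng = np.random.default_rng(0)
clients, d, local_steps = 5, 10, 20
A = rng.normal(size=(clients, 30, d))      # each client's private data
b = rng.normal(size=(clients, 30))

w = np.zeros(d)                            # server model
step = 0.01
for rnd in range(50):                      # communication rounds
    updates = []
    for c in range(clients):
        x = w.copy()                       # client starts from the server model
        for _ in range(local_steps):
            i = rng.integers(30)
            x -= step * A[c, i] * (A[c, i] @ x - b[c, i])
        updates.append(x)
    w = np.mean(updates, axis=0)           # server averages the client models
print("grad norm:", np.linalg.norm(
    sum(A[c].T @ (A[c] @ w - b[c]) for c in range(clients)) / (clients * 30)))
```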

MURANA: A generic framework for stochastic variance-reduced optimization

L Condat, P Richtárik - Mathematical and Scientific Machine …, 2022 - proceedings.mlr.press
We propose a generic variance-reduced algorithm, which we call MUltiple RANdomized
Algorithm (MURANA), for minimizing a sum of several smooth functions plus a regularizer, in …
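
MURANA models randomized updates, such as sparsification or quantization, through generic unbiased randomization operators. A minimal sketch of one such compressor, random-k sparsification with the rescaling that makes it unbiased (the compressor choice is illustrative, not the paper's specific operator):

```python
import numpy as np

rng = np.random.default_rng(0)

def rand_k(v, k):
    # Keep k random coordinates and rescale by d/k so that E[rand_k(v)] = v
    d = v.size
    out = np.zeros_like(v)
    idx = rng.choice(d, size=k, replace=False)
    out[idx] = v[idx] * (d / k)
    return out

v = rng.normal(size=10)
# Empirical check of unbiasedness
est = np.mean([rand_k(v, 3) for _ in range(20000)], axis=0)
print("max deviation from v:", np.max(np.abs(est - v)))
```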

SVRG meets AdaGrad: Painless variance reduction

B Dubois-Taine, S Vaswani, R Babanezhad… - Machine Learning, 2022 - Springer
Variance reduction (VR) methods for finite-sum minimization typically require the knowledge
of problem-dependent constants that are often unknown and difficult to estimate. To address …
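
The combination is natural to sketch: use the SVRG estimator as the search direction, but set the step with an AdaGrad-style accumulator so that no smoothness constant is needed. A minimal sketch using the scalar AdaGrad-norm variant, which is one of several options and an illustrative simplification of the paper's methods:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 5
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

x = np.zeros(d)
eta, accum = 1.0, 1e-8
for epoch in range(30):
    w = x.copy()
    gw = A.T @ (A @ w - b) / n          # full gradient at the snapshot
    for _ in range(n):
        i = rng.integers(n)
        g = (A[i] * (A[i] @ x - b[i])
             - A[i] * (A[i] @ w - b[i]) + gw)   # SVRG estimator
        accum += g @ g                   # AdaGrad-norm accumulator
        x -= eta / np.sqrt(accum) * g    # step adapts, no smoothness constant
print("grad norm:", np.linalg.norm(A.T @ (A @ x - b) / n))
```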