Convex optimization for big data: Scalable, randomized, and parallel algorithms for big data analytics

V Cevher, S Becker, M Schmidt - IEEE Signal Processing …, 2014 - ieeexplore.ieee.org
This article reviews recent advances in convex optimization algorithms for big data, which
aim to reduce the computational, storage, and communications bottlenecks. We provide an …

Achieving geometric convergence for distributed optimization over time-varying graphs

A Nedic, A Olshevsky, W Shi - SIAM Journal on Optimization, 2017 - SIAM
This paper considers the problem of distributed optimization over time-varying graphs. For
the case of undirected graphs, we introduce a distributed algorithm, referred to as DIGing …
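
For intuition, the following is a minimal sketch of the gradient-tracking recursion DIGing is built on, x_{k+1} = W x_k - α y_k and y_{k+1} = W y_k + ∇f(x_{k+1}) - ∇f(x_k), under the simplifying assumption of a fixed doubly stochastic mixing matrix W (the paper's contribution is precisely that the scheme still converges geometrically when the graph, and hence W_k, varies over time). Function and variable names are illustrative, not from the paper.

```python
import numpy as np

def diging(grad_fns, W, x0, alpha=0.1, iters=200):
    """Gradient-tracking sketch in the spirit of DIGing.

    grad_fns : list of per-node gradient functions, grad_fns[i](x_i) -> ndarray
    W        : doubly stochastic mixing matrix (n x n), assumed fixed here
    x0       : (n, d) array of initial local iterates
    """
    n = len(grad_fns)
    x = x0.copy()
    g = np.stack([grad_fns[i](x[i]) for i in range(n)])
    y = g.copy()                        # tracker whose average follows the average gradient
    for _ in range(iters):
        x_next = W @ x - alpha * y      # consensus step minus tracked gradient
        g_next = np.stack([grad_fns[i](x_next[i]) for i in range(n)])
        y = W @ y + g_next - g          # gradient-tracking update
        x, g = x_next, g_next
    return x.mean(axis=0)

# Toy usage: n nodes, node i holds f_i(x) = 0.5 * ||x - b_i||^2, so the minimizer is mean(b).
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 5, 3
    b = rng.normal(size=(n, d))
    grads = [lambda x, bi=b[i]: x - bi for i in range(n)]
    W = np.full((n, n), 1.0 / n)        # complete-graph averaging, a deliberate simplification
    x_hat = diging(grads, W, np.zeros((n, d)))
    print(np.allclose(x_hat, b.mean(axis=0), atol=1e-3))
```

The invariant behind the geometric rate is that the average of the y-variables always equals the current average of the local gradients, so each node effectively steps along an estimate of the global gradient.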

Coresets for data-efficient training of machine learning models

B Mirzasoleiman, J Bilmes… - … Conference on Machine …, 2020 - proceedings.mlr.press
Incremental gradient (IG) methods, such as stochastic gradient descent and its variants, are
commonly used for large scale optimization in machine learning. Despite the sustained effort …

Adaptive SGD with Polyak stepsize and line-search: Robust convergence and variance reduction

X Jiang, SU Stich - Advances in Neural Information …, 2024 - proceedings.neurips.cc
The recently proposed stochastic Polyak stepsize (SPS) and stochastic line-search (SLS) for
SGD have shown remarkable effectiveness when training over-parameterized models …
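
For reference, a minimal sketch of SGD with a stochastic Polyak stepsize, γ = min(f_i(x)/(c‖∇f_i(x)‖²), γ_max), assuming the per-sample optimal values f_i* are zero (the usual simplification for interpolating, over-parameterized models). The name sps_sgd and the constants below are illustrative.

```python
import numpy as np

def sps_sgd(loss_grad, data, x0, c=0.5, gamma_max=1.0, epochs=50, seed=0):
    """SGD with a stochastic Polyak stepsize (simplified sketch).

    loss_grad : function (x, sample) -> (f_i(x), grad f_i(x))
    Assumes f_i* = 0 for every sample, i.e. the model can fit each point exactly.
    """
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(epochs):
        for i in rng.permutation(len(data)):
            fi, gi = loss_grad(x, data[i])
            gamma = min(fi / (c * np.dot(gi, gi) + 1e-12), gamma_max)
            x -= gamma * gi
    return x

# Toy usage: consistent least squares, where (with c = 0.5) the step reduces to a Kaczmarz projection.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.normal(size=(100, 10))
    x_true = rng.normal(size=10)
    samples = list(zip(A, A @ x_true))
    lg = lambda x, s: (0.5 * (s[0] @ x - s[1]) ** 2, (s[0] @ x - s[1]) * s[0])
    x_hat = sps_sgd(lg, samples, np.zeros(10))
    print(np.linalg.norm(x_hat - x_true) < 1e-3)
```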

[Book][B] Deep learning

Y Bengio, I Goodfellow, A Courville - 2017 - academia.edu
Inventors have long dreamed of creating machines that think. Ancient Greek myths tell of
intelligent objects, such as animated statues of human beings and tables that arrive full of …

Minimizing finite sums with the stochastic average gradient

M Schmidt, N Le Roux, F Bach - Mathematical Programming, 2017 - Springer
We analyze the stochastic average gradient (SAG) method for optimizing the sum of a finite
number of smooth convex functions. Like stochastic gradient (SG) methods, the SAG …
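
A minimal sketch of the gradient-memory idea behind SAG, assuming the last gradient of every summand can be kept in memory (O(nd) storage in this dense version); names are illustrative.

```python
import numpy as np

def sag(grad_i, n, x0, step, iters=100_000, seed=0):
    """Stochastic average gradient (SAG) sketch for min_x (1/n) sum_i f_i(x).

    grad_i : function (x, i) -> gradient of f_i at x
    Keeps one stored gradient per summand and steps along their average,
    refreshing only the sampled entry at each iteration.
    """
    rng = np.random.default_rng(seed)
    x = x0.copy()
    memory = np.zeros((n, x0.size))     # table of last-seen gradients
    avg = np.zeros(x0.size)             # running average of the table
    for _ in range(iters):
        i = rng.integers(n)
        g = grad_i(x, i)
        avg += (g - memory[i]) / n      # O(d) update of the average
        memory[i] = g
        x -= step * avg
    return x

# Toy usage: least squares with f_i(x) = 0.5 * (a_i @ x - y_i)^2.
if __name__ == "__main__":
    rng = np.random.default_rng(2)
    A = rng.normal(size=(200, 5))
    y = A @ rng.normal(size=5)
    gi = lambda x, i: (A[i] @ x - y[i]) * A[i]
    L_max = np.max(np.sum(A ** 2, axis=1))   # largest per-example Lipschitz constant
    x_hat = sag(gi, 200, np.zeros(5), step=1.0 / L_max)
    print(np.linalg.norm(A @ x_hat - y) < 1e-3)
```

Each iteration costs one gradient evaluation plus an O(d) update, like plain SGD, while the update direction aggregates information from all n summands, which is what buys the faster convergence on finite sums.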

Hogwild!: A lock-free approach to parallelizing stochastic gradient descent

B Recht, C Re, S Wright, F Niu - Advances in neural …, 2011 - proceedings.neurips.cc
Stochastic Gradient Descent (SGD) is a popular algorithm that can achieve state-of-
the-art performance on a variety of machine learning tasks. Several researchers have …
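
A minimal sketch of the lock-free update pattern, assuming a shared NumPy parameter vector and Python threads. In CPython the GIL serializes the bytecode, so this only illustrates the asynchronous, lock-free access pattern; the speedups reported in the paper come from native threads and sparse per-example updates.

```python
import threading
import numpy as np

def hogwild_sgd(A, y, n_threads=4, step=0.005, sweeps=50):
    """Lock-free parallel SGD sketch in the spirit of Hogwild!.

    All threads read and write the shared parameter vector x in place,
    with no locks, so reads may see partially updated (stale) values.
    """
    n, d = A.shape
    x = np.zeros(d)                              # shared parameters

    def worker(seed):
        rng = np.random.default_rng(seed)
        for _ in range(sweeps * n // n_threads):
            i = rng.integers(n)
            g = (A[i] @ x - y[i]) * A[i]         # read possibly stale x
            np.subtract(x, step * g, out=x)      # in-place, lock-free write

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    A = rng.normal(size=(1000, 10))
    y = A @ rng.normal(size=10)
    x_hat = hogwild_sgd(A, y)
    print(np.linalg.norm(A @ x_hat - y) / np.linalg.norm(y) < 1e-2)
```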

[Book][B] First-order methods in optimization

A Beck - 2017 - SIAM
This book, as the title suggests, is about first-order methods, namely, methods that exploit
information on values and gradients/subgradients (but not Hessians) of the functions …
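
As a concrete instance of the class of methods the book covers, here is a short proximal-gradient (ISTA) sketch for l1-regularized least squares: it uses only gradients of the smooth part and a proximal (soft-thresholding) step, never Hessians. The problem setup is illustrative.

```python
import numpy as np

def ista(A, y, lam=0.1, iters=5000):
    """Proximal gradient (ISTA) for min_x 0.5*||Ax - y||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2                # Lipschitz constant of the smooth gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g = A.T @ (A @ x - y)                    # gradient of the smooth part
        z = x - g / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)   # soft-thresholding prox
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    A = rng.normal(size=(50, 100))
    x_true = np.zeros(100)
    x_true[:5] = 1.0
    x_hat = ista(A, A @ x_true)
    print(np.count_nonzero(np.abs(x_hat) > 0.1))  # typically reports the 5-sparse support
```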

A stochastic quasi-Newton method for large-scale optimization

RH Byrd, SL Hansen, J Nocedal, Y Singer - SIAM Journal on Optimization, 2016 - SIAM
The question of how to incorporate curvature information into stochastic approximation
methods is challenging. The direct application of classical quasi-Newton updating …
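
The following is a deliberately simplified sketch in the spirit of stochastic quasi-Newton methods: it builds limited-memory curvature pairs from gradient differences evaluated on the same minibatch (an oLBFGS-style shortcut), whereas the paper's method uses subsampled Hessian-vector products at averaged iterates. All names are illustrative.

```python
import numpy as np
from collections import deque

def stochastic_lbfgs(grad_batch, x0, batches, step=0.5, iters=2000, memory=10, seed=0):
    """Simplified stochastic quasi-Newton sketch.

    grad_batch : function (x, b) -> gradient of the b-th minibatch objective at x
    batches    : number of minibatches
    """
    rng = np.random.default_rng(seed)
    x = x0.copy()
    pairs = deque(maxlen=memory)             # limited-memory (s, y) curvature history

    def two_loop(g):                         # standard L-BFGS two-loop recursion
        q, alphas = g.copy(), []
        for s, y in reversed(pairs):
            a = (s @ q) / (y @ s)
            alphas.append(a)
            q -= a * y
        if pairs:
            s, y = pairs[-1]
            q *= (y @ s) / (y @ y)           # initial inverse-Hessian scaling
        for (s, y), a in zip(pairs, reversed(alphas)):
            b = (y @ q) / (y @ s)
            q += (a - b) * s
        return q

    for _ in range(iters):
        b = rng.integers(batches)
        g = grad_batch(x, b)
        d = two_loop(g)                      # quasi-Newton direction
        x_new = x - step * d
        s, y = x_new - x, grad_batch(x_new, b) - g   # same minibatch -> consistent pair
        if y @ s > 1e-10:                    # keep only curvature-positive pairs
            pairs.append((s, y))
        x = x_new
    return x

# Toy usage: consistent least squares split into 20 minibatches.
if __name__ == "__main__":
    rng = np.random.default_rng(5)
    A = rng.normal(size=(1000, 20))
    y = A @ rng.normal(size=20)
    nb = 20
    gb = lambda x, b: A[b::nb].T @ (A[b::nb] @ x - y[b::nb]) / len(y[b::nb])
    x_hat = stochastic_lbfgs(gb, np.zeros(20), nb)
    print(np.linalg.norm(A @ x_hat - y) / np.linalg.norm(y) < 1e-2)
```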

A stochastic gradient method with an exponential convergence rate for finite training sets

N Le Roux, M Schmidt, F Bach - Advances in neural …, 2012 - proceedings.neurips.cc
We propose a new stochastic gradient method for optimizing the sum of a finite set of smooth
functions, where the sum is strongly convex. While standard stochastic gradient methods …
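
Schematically, the setting and the rate contrast at stake, with constants and precise smoothness/strong-convexity conditions omitted:

```latex
\[
  \min_{x \in \mathbb{R}^d} \; g(x) = \frac{1}{n} \sum_{i=1}^{n} f_i(x),
  \qquad g \ \text{strongly convex}, \quad f_i \ \text{smooth}.
\]
\[
  \text{plain SGD (decaying steps):} \quad \mathbb{E}\,[\,g(x_k)\,] - g(x^\ast) = O(1/k),
  \qquad
  \text{gradient-memory methods:} \quad \mathbb{E}\,[\,g(x_k)\,] - g(x^\ast) = O(\rho^k), \ \rho \in (0,1).
\]
```

The second, linear rate is the "exponential convergence" in the title, obtained while keeping the per-iteration cost at a single component gradient.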