Adaptive SGD with Polyak stepsize and line-search: Robust convergence and variance reduction

X Jiang, SU Stich - Advances in Neural Information …, 2024 - proceedings.neurips.cc
The recently proposed stochastic Polyak stepsize (SPS) and stochastic line-search (SLS) for
SGD have shown remarkable effectiveness when training over-parameterized models …
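For concreteness, a minimal sketch of the two step-size rules named in this abstract, on a toy least-squares problem: the stochastic Polyak stepsize γ = (f_i(x) − f_i*)/(c‖∇f_i(x)‖²) capped at γ_max, with f_i* taken as 0 (interpolation setting), and a backtracking stochastic line search enforcing the stochastic Armijo condition. The constants c, gamma_max, gamma0, and beta are illustrative choices, not the paper's.

```python
import numpy as np

# Toy interpolation problem: every per-example loss can be driven to zero.
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 10))
x_true = rng.normal(size=10)
b = A @ x_true

def f_i(x, i):          # loss of a single example
    return 0.5 * (A[i] @ x - b[i]) ** 2

def grad_i(x, i):       # its gradient
    return (A[i] @ x - b[i]) * A[i]

def sps_step(x, i, c=0.5, gamma_max=10.0, f_i_star=0.0):
    # Stochastic Polyak stepsize, capped at gamma_max (SPS_max-style rule).
    g = grad_i(x, i)
    gamma = min((f_i(x, i) - f_i_star) / (c * (g @ g) + 1e-12), gamma_max)
    return x - gamma * g

def sls_step(x, i, gamma0=1.0, beta=0.7, c=0.1):
    # Backtracking stochastic line search: shrink gamma until the
    # stochastic Armijo condition holds on the sampled example.
    g = grad_i(x, i)
    gamma = gamma0
    while f_i(x - gamma * g, i) > f_i(x, i) - c * gamma * (g @ g):
        gamma *= beta
    return x - gamma * g

x_sps = x_sls = np.zeros(10)
for t in range(2000):
    i = rng.integers(len(b))
    x_sps, x_sls = sps_step(x_sps, i), sls_step(x_sls, i)
print(np.linalg.norm(x_sps - x_true), np.linalg.norm(x_sls - x_true))
```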

A weakly supervised consistency-based learning method for COVID-19 segmentation in CT images


I Laradji, P Rodriguez, O Manas… - Proceedings of the …, 2021 - openaccess.thecvf.com
Coronavirus Disease 2019 (COVID-19) has spread aggressively across the world,
causing an existential health crisis. Thus, having a system that automatically detects COVID …

Almost sure convergence rates for stochastic gradient descent and stochastic heavy ball

O Sebbouh, RM Gower… - Conference on Learning …, 2021 - proceedings.mlr.press
We study stochastic gradient descent (SGD) and the stochastic heavy ball method (SHB,
otherwise known as the momentum method) for the general stochastic approximation …
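The SHB iteration referred to here is the standard momentum update x_{t+1} = x_t − γ g_t + β (x_t − x_{t−1}). A minimal sketch on a toy noisy least-squares problem; γ and β are illustrative values, not the paper's.

```python
import numpy as np

# Stochastic heavy ball (SHB / momentum):
#   x_{t+1} = x_t - gamma * g_t + beta * (x_t - x_{t-1}).
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 20))
x_star = rng.normal(size=20)
b = A @ x_star + 0.01 * rng.normal(size=200)  # small label noise

def stoch_grad(x, i):
    return (A[i] @ x - b[i]) * A[i]

gamma, beta = 0.001, 0.9
x_prev = x = np.zeros(20)
for t in range(5000):
    i = rng.integers(len(b))
    x, x_prev = x - gamma * stoch_grad(x, i) + beta * (x - x_prev), x
print(np.linalg.norm(x - x_star))
```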

Dynamics of SGD with stochastic Polyak stepsizes: Truly adaptive variants and convergence to exact solution

A Orvieto, S Lacoste-Julien… - Advances in Neural …, 2022 - proceedings.neurips.cc
Recently, Loizou et al. (2021) proposed and analyzed stochastic gradient descent
(SGD) with stochastic Polyak stepsize (SPS). The proposed SPS comes with strong …
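The point of this abstract is that plain SPS only reaches a neighborhood of the solution when interpolation fails, and truly adaptive variants recover convergence to the exact solution. Below is an illustrative capped variant whose upper bound decays like 1/√t; this is only a stand-in to show the idea, not the paper's exact DecSPS rule, and f_i* is assumed to be 0.

```python
import numpy as np

# Noisy targets: interpolation does NOT hold, so a constant-cap SPS would
# only converge to a neighborhood; a decaying cap lets the error keep shrinking.
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 10))
x_star = rng.normal(size=10)
b = A @ x_star + 0.1 * rng.normal(size=200)

def f_i(x, i):
    return 0.5 * (A[i] @ x - b[i]) ** 2

def grad_i(x, i):
    return (A[i] @ x - b[i]) * A[i]

x = np.zeros(10)
c, gamma_b = 1.0, 1.0
for t in range(20000):
    i = rng.integers(len(b))
    g = grad_i(x, i)
    gamma = min(f_i(x, i) / (c * (g @ g) + 1e-12), gamma_b / np.sqrt(t + 1))
    x -= gamma * g
print(np.linalg.norm(x - np.linalg.lstsq(A, b, rcond=None)[0]))
```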

Nest your adaptive algorithm for parameter-agnostic nonconvex minimax optimization

J Yang, X Li, N He - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Adaptive algorithms like AdaGrad and AMSGrad are successful in nonconvex optimization
owing to their parameter-agnostic ability, requiring no a priori knowledge about problem …
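For reference, the parameter-agnostic behavior mentioned here comes from scaling each coordinate by the accumulated squared gradients. A minimal sketch of diagonal AdaGrad on a toy problem; eta and eps are illustrative choices.

```python
import numpy as np

# Diagonal AdaGrad: per-coordinate step eta / sqrt(sum of past squared grads),
# so no smoothness constant needs to be supplied up front.
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 10))
x_star = rng.normal(size=10)
b = A @ x_star

def stoch_grad(x, i):
    return (A[i] @ x - b[i]) * A[i]

eta, eps = 1.0, 1e-8
x = np.zeros(10)
accum = np.zeros(10)           # running sum of squared gradients
for t in range(5000):
    i = rng.integers(len(b))
    g = stoch_grad(x, i)
    accum += g ** 2
    x -= eta * g / (np.sqrt(accum) + eps)
print(np.linalg.norm(x - x_star))
```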

Amortized proximal optimization

J Bae, P Vicol, JZ HaoChen… - Advances in Neural …, 2022 - proceedings.neurips.cc
We propose a framework for online meta-optimization of parameters that govern
optimization, called Amortized Proximal Optimization (APO). We first interpret various …

Towards noise-adaptive, problem-adaptive (accelerated) stochastic gradient descent

S Vaswani, B Dubois-Taine… - … on machine learning, 2022 - proceedings.mlr.press
We aim to make stochastic gradient descent (SGD) adaptive to (i) the noise $\sigma^2$ in
the stochastic gradients and (ii) problem-dependent constants. When minimizing smooth …
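One ingredient commonly analyzed in this line of work is SGD with an exponentially decreasing step-size schedule γ_t = γ_0 α^t, which interpolates between a constant step (noiseless case) and a decaying one (noisy case). The sketch below only shows that schedule on a toy problem; γ_0, T, and the decay target are illustrative assumptions, not the paper's method in full.

```python
import numpy as np

# SGD with exponentially decreasing step sizes gamma_t = gamma_0 * alpha**t.
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 10))
x_star = rng.normal(size=10)
b = A @ x_star + 0.1 * rng.normal(size=200)   # noisy targets, sigma^2 > 0

def stoch_grad(x, i):
    return (A[i] @ x - b[i]) * A[i]

T, gamma_0 = 10000, 0.05
alpha = (1.0 / T) ** (1.0 / T)   # decays the step down to gamma_0 / T by time T
x = np.zeros(10)
for t in range(T):
    i = rng.integers(len(b))
    x -= gamma_0 * alpha ** t * stoch_grad(x, i)
print(np.linalg.norm(x - np.linalg.lstsq(A, b, rcond=None)[0]))
```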

Why line search when you can plane search? SO-friendly neural networks allow per-iteration optimization of learning and momentum rates for every layer

B Shea, M Schmidt - arXiv preprint arXiv:2406.17954, 2024 - arxiv.org
We introduce the class of SO-friendly neural networks, which includes several models used in
practice, including networks with 2 layers of hidden weights where the number of inputs is …
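A plane search, as opposed to a line search, jointly picks a learning rate and a momentum rate each iteration by minimizing over a two-dimensional subspace. The paper's contribution is doing that subproblem efficiently for SO-friendly architectures; the sketch below is only a naive grid-evaluation stand-in to convey the idea, with illustrative candidate grids.

```python
import numpy as np

# Naive per-iteration "plane search" over span{gradient direction, previous step}:
# evaluate the loss on a small grid of (alpha, beta) pairs and keep the best.
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 10))
x_star = rng.normal(size=10)
b = A @ x_star

def loss(x):
    return 0.5 * np.mean((A @ x - b) ** 2)

def grad(x):
    return A.T @ (A @ x - b) / len(b)

alphas = np.logspace(-3, 0, 8)          # candidate learning rates
betas = np.linspace(0.0, 0.9, 8)        # candidate momentum rates
x, prev_step = np.zeros(10), np.zeros(10)
for t in range(50):
    d = -grad(x)
    _, a, m = min((loss(x + a * d + m * prev_step), a, m)
                  for a in alphas for m in betas)
    step = a * d + m * prev_step
    x, prev_step = x + step, step
print(loss(x))
```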

SVRG meets AdaGrad: Painless variance reduction

B Dubois-Taine, S Vaswani, R Babanezhad… - Machine Learning, 2022 - Springer
Variance reduction (VR) methods for finite-sum minimization typically require the knowledge
of problem-dependent constants that are often unknown and difficult to estimate. To address …
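To make the combination concrete: SVRG's variance-reduced gradient g = ∇f_i(x) − ∇f_i(x̃) + ∇f(x̃) (with snapshot x̃) can be fed into an AdaGrad-style update so that no smoothness constant has to be supplied. A minimal sketch under that reading; the hyperparameters and inner-loop length are illustrative, not the paper's exact choices.

```python
import numpy as np

# SVRG-style variance-reduced gradients driven by AdaGrad step sizes.
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 10))
x_star = rng.normal(size=10)
b = A @ x_star

def grad_i(x, i):
    return (A[i] @ x - b[i]) * A[i]

def full_grad(x):
    return A.T @ (A @ x - b) / len(b)

eta, eps = 1.0, 1e-8
x = np.zeros(10)
accum = np.zeros(10)
for epoch in range(30):
    x_snap, mu = x.copy(), full_grad(x)      # snapshot point and its full gradient
    for _ in range(len(b)):                  # inner loop over sampled examples
        i = rng.integers(len(b))
        g = grad_i(x, i) - grad_i(x_snap, i) + mu   # variance-reduced gradient
        accum += g ** 2
        x -= eta * g / (np.sqrt(accum) + eps)
print(np.linalg.norm(x - x_star))
```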

Target-based surrogates for stochastic optimization

JW Lavington, S Vaswani, R Babanezhad… - arXiv preprint arXiv …, 2023 - arxiv.org
We consider minimizing functions for which it is expensive to compute the (possibly
stochastic) gradient. Such functions are prevalent in reinforcement learning, imitation …