Making SGD parameter-free

Y Carmon, O Hinder - Conference on Learning Theory, 2022 - proceedings.mlr.press
… for parameter-free SCO are based on online parameter-free regret … is a novel parameter-free
certificate for SGD step size choice, … result that assumes no a-priori bounds on SGD iterates. …

How Free is Parameter-Free Stochastic Optimization?

A Attia, T Koren - arXiv preprint arXiv:2402.03126, 2024 - arxiv.org
SGD results in a fully parameter-free method with convergence rate that matches optimally
tuned SGD up … to design a simple and efficient parameter-free algorithm, provided access to …
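
A rough reading of "matches optimally tuned SGD" in the standard convex, G-Lipschitz stochastic setting (the symbols D, G, and T below are illustrative, not taken from the snippet): with the oracle step size eta = D/(G sqrt(T)) and D = ||x_1 - x*||, SGD guarantees

    \mathbb{E}\big[f(\bar{x}_T) - f(x^\star)\big] \;\le\; \frac{D\,G}{\sqrt{T}},

and "parameter-free" here means reaching this bound, up to polylogarithmic factors, without knowing D or G in advance.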

Simultaneous model selection and optimization through parameter-free stochastic learning

F Orabona - Advances in Neural Information Processing …, 2014 - proceedings.neurips.cc
… There are two possible ways to use an SGD algorithm: to optimize a batch objective function,
… in turn makes SVMs the method of choice". Indeed, in this paper we give a parameter-free …

DoG is SGD's best friend: A parameter-free dynamic step size schedule

M Ivgi, O Hinder, Y Carmon - International Conference on …, 2023 - proceedings.mlr.press
… We believe that this difference is crucial in making DoG parameter-free, while the above-mentioned
methods have a learning-rate parameter to tune (though Bernstein et al. (2020) report …
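
For reference, the rule named in this entry divides the largest distance travelled from the starting point by the root of the accumulated squared gradient norms. Below is a minimal NumPy sketch of such a DoG-style step size; the toy quadratic objective, the r_eps initialization, and all names are illustrative choices, not details taken from the paper.

    import numpy as np

    def dog_sgd(grad, x0, steps=1000, r_eps=1e-6, seed=0):
        """Minimal sketch of a DoG-style dynamic step size:
        eta_t = (max distance from x0 so far) / sqrt(sum of squared gradient norms)."""
        rng = np.random.default_rng(seed)
        x = np.array(x0, dtype=float)
        r_bar = r_eps          # small initial radius so the first step is nonzero
        grad_sq_sum = 0.0
        for _ in range(steps):
            g = grad(x, rng)                                   # stochastic gradient at x_t
            grad_sq_sum += float(np.dot(g, g))                 # accumulate ||g_s||^2
            r_bar = max(r_bar, float(np.linalg.norm(x - x0)))  # max distance from x0
            eta = r_bar / np.sqrt(grad_sq_sum + 1e-12)
            x = x - eta * g
        return x

    # toy usage: noisy gradients of f(x) = 0.5 * ||x||^2
    if __name__ == "__main__":
        noisy_grad = lambda x, rng: x + 0.1 * rng.standard_normal(x.shape)
        print(dog_sgd(noisy_grad, x0=np.ones(5)))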

Tuning-Free Stochastic Optimization

A Khaled, C Jin - arXiv preprint arXiv:2402.07793, 2024 - arxiv.org
… show tuning-free matching of SGD is possible and achieved by several … Let us make our
setting more specific. We consider the … An algorithm A is called parameter-free if it achieves the …

Better parameter-free stochastic optimization with ODE updates for coin-betting

K Chen, J Langford, F Orabona - … of the AAAI Conference on Artificial …, 2022 - ojs.aaai.org
… better than SGD, AdaGrad, and … make infinitely many infinitesimal SGD updates for each
sampled loss. We provide two improvements: making the approach work for a parameter-free up…
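
For context, coin-betting methods cast parameter-free learning as betting a fraction of accumulated "wealth" on each incoming gradient. The sketch below shows the standard one-dimensional Krichevsky-Trofimov bettor that this line of work builds on, not the paper's ODE-based refinement; the initial wealth eps, the round count, and the toy objective are illustrative choices.

    def kt_coin_betting_minimize(subgrad, rounds=2000, eps=1.0):
        """Sketch of the 1-D Krichevsky-Trofimov coin-betting learner:
        bet a fraction of the current wealth proportional to the running sum
        of negative (sub)gradients. `subgrad(w)` must return values in [-1, 1];
        the averaged iterate is returned (online-to-batch conversion)."""
        wealth = eps        # initial capital: a design constant, not a tuned step size
        grad_sum = 0.0      # running sum of -g_t
        avg = 0.0
        for t in range(1, rounds + 1):
            w = (grad_sum / t) * wealth   # KT betting fraction times current wealth
            g = subgrad(w)
            wealth -= g * w               # wealth after observing the outcome
            grad_sum -= g
            avg += (w - avg) / t          # running average of iterates
        return avg

    # toy usage: minimize |w - 0.3| through its subgradient (values in [-1, 1])
    if __name__ == "__main__":
        print(kt_coin_betting_minimize(lambda w: 1.0 if w > 0.3 else -1.0))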

PDP: parameter-free differentiable pruning is all you need

M Cho, S Adya, D Naik - Advances in Neural Information …, 2024 - proceedings.neurips.cc
… w is destined to be pruned for some reason, instead of having a new parameter to denote
"to-prune", PDP lets SGD gradually make w itself smaller relative to other parameters in …

DoWG unleashed: An efficient universal parameter-free gradient descent method

A Khaled, K Mishchenko, C Jin - Advances in Neural …, 2023 - proceedings.neurips.cc
… requirement makes the method not parameter-free. … The simplest way to make an algorithm
parameter-free is the … DoG is SGD’s best friend: A parameter-free dynamic step size schedule…
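
The title's distance-over-weighted-gradients rule can be read, roughly, as normalizing each step by a gradient sum weighted by the squared distance travelled so far. The sketch below is an assumed DoWG-style implementation of that idea; the exact normalization, the r_eps value, and the toy problem are illustrative and may differ from the paper's definition.

    import numpy as np

    def dowg_gd(grad, x0, steps=1000, r_eps=1e-4):
        """Rough sketch of a distance-over-weighted-gradients step size:
        eta_t = r_bar_t**2 / sqrt(sum_s r_bar_s**2 * ||g_s||**2)."""
        x = np.array(x0, dtype=float)
        r_bar = r_eps      # small initial distance estimate
        v = 0.0            # weighted sum of squared gradient norms
        for _ in range(steps):
            g = grad(x)
            r_bar = max(r_bar, float(np.linalg.norm(x - x0)))  # max distance from x0
            v += r_bar ** 2 * float(np.dot(g, g))
            eta = r_bar ** 2 / np.sqrt(v + 1e-12)
            x = x - eta * g
        return x

    # toy usage: deterministic gradient of f(x) = 0.5 * ||x||^2
    if __name__ == "__main__":
        print(dowg_gd(lambda x: x, x0=np.ones(3)))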

Towards Simple and Provable Parameter-Free Adaptive Gradient Methods

Y Tao, H Yuan, X Zhou, Y Cao, Q Gu - arXiv preprint arXiv:2412.19444, 2024 - arxiv.org
… on parameter-free adaptive gradient methods remains relatively limited. Specifically, most of
the existing parameter-free algorithms are essentially variants of SGD… This factor makes the …

Problem-Parameter-Free Decentralized Nonconvex Stochastic Optimization

J Li, X Chen, S Ma, M Hong - arXiv preprint arXiv:2402.08821, 2024 - arxiv.org
… It is natural to ask whether one can achieve parameter-free … paper, and in Table 1 we make
the comparison between our … We conduct a simple grid search for D-SGD, D-SGD and D-…