Abstract: Coronavirus Disease 2019 (COVID-19) has spread aggressively across the world, causing an existential health crisis. Thus, having a system that automatically detects COVID …
O Sebbouh, RM Gower… - Conference on Learning …, 2021 - proceedings.mlr.press
We study stochastic gradient descent (SGD) and the stochastic heavy ball method (SHB, otherwise known as the momentum method) for the general stochastic approximation …
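For orientation, the stochastic heavy ball (momentum) update referred to in this snippet is x_{k+1} = x_k - gamma * g_k + beta * (x_k - x_{k-1}). The sketch below is illustrative only and is not drawn from the cited paper; the stepsize gamma, momentum beta, and toy least-squares data are placeholder choices.

```python
import numpy as np

# One stochastic heavy ball (momentum) step:
# x_{k+1} = x_k - gamma * g_k + beta * (x_k - x_{k-1})
def shb_step(x, x_prev, grad, gamma=0.01, beta=0.9):
    return x - gamma * grad + beta * (x - x_prev)

# Toy usage on f(x) = 0.5 * ||A x - b||^2, sampling one row per step.
rng = np.random.default_rng(0)
A, b = rng.normal(size=(100, 5)), rng.normal(size=100)
x_prev = x = np.zeros(5)
for _ in range(200):
    i = rng.integers(len(b))             # sample one data point
    grad = (A[i] @ x - b[i]) * A[i]      # stochastic gradient of 0.5*(a_i^T x - b_i)^2
    x, x_prev = shb_step(x, x_prev, grad), x
```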
Abstract: Recently, Loizou et al. (2021) proposed and analyzed stochastic gradient descent (SGD) with stochastic Polyak stepsize (SPS). The proposed SPS comes with strong …
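As a hedged sketch of the stepsize rule this snippet refers to: the commonly used bounded SPS sets gamma_k = min((f_i(x_k) - f_i^*) / (c * ||grad f_i(x_k)||^2), gamma_max). The code below follows that standard formulation, not necessarily the exact variant analyzed in the snippet's paper; the constants c, gamma_max, and the assumption f_i^* = 0 are illustrative.

```python
import numpy as np

# SGD step with a bounded stochastic Polyak stepsize:
# gamma_k = min( (f_i(x_k) - f_i^*) / (c * ||grad f_i(x_k)||^2), gamma_max )
def sps_step(x, loss_i, grad_i, f_i_star=0.0, c=0.5, gamma_max=1.0):
    denom = c * float(np.dot(grad_i, grad_i)) + 1e-12   # guard against a zero gradient
    gamma = min((loss_i - f_i_star) / denom, gamma_max)
    return x - gamma * grad_i

# Single illustrative update with made-up values and interpolation target f_i^* = 0.
x = np.array([1.0, -2.0])
g = np.array([0.5, 1.0])
x = sps_step(x, loss_i=0.8, grad_i=g)
```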
J Yang, X Li, N He - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Adaptive algorithms like AdaGrad and AMSGrad are successful in nonconvex optimization owing to their parameter-agnostic ability, requiring no a priori knowledge about problem …
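For reference, the per-coordinate AdaGrad update mentioned here divides a base stepsize by the square root of the accumulated squared gradients. The sketch below is a generic illustration of that rule (the learning rate, epsilon, and toy values are placeholders), not code from the snippet's paper.

```python
import numpy as np

# One AdaGrad step: accumulate squared gradients per coordinate and
# scale the base stepsize by 1 / sqrt(accumulator).
def adagrad_step(x, grad, accum, lr=0.1, eps=1e-8):
    accum = accum + grad ** 2
    x = x - lr * grad / (np.sqrt(accum) + eps)
    return x, accum

# Single illustrative update.
x, accum = np.zeros(3), np.zeros(3)
g = np.array([1.0, -0.5, 0.0])
x, accum = adagrad_step(x, g, accum)
```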
We propose a framework for online meta-optimization of parameters that govern optimization, called Amortized Proximal Optimization (APO). We first interpret various …
We aim to make stochastic gradient descent (SGD) adaptive to (i) the noise $\sigma^2$ in the stochastic gradients and (ii) problem-dependent constants. When minimizing smooth …
B Shea, M Schmidt - arXiv preprint arXiv:2406.17954, 2024 - arxiv.org
We introduce the class of SO-friendly neural networks, which includes several models used in practice, such as networks with 2 layers of hidden weights where the number of inputs is …
Variance reduction (VR) methods for finite-sum minimization typically require knowledge of problem-dependent constants that are often unknown and difficult to estimate. To address …
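Variance-reduction methods of the kind referred to here typically replace the plain stochastic gradient with a control-variate estimator built from a periodically refreshed snapshot. The SVRG-style sketch below is illustrative only (standard SVRG on a toy least-squares problem, with placeholder data, loop lengths, and stepsize), not the specific estimator from the snippet's paper.

```python
import numpy as np

# SVRG-style variance-reduced estimator on f(x) = (1/n) sum_i 0.5*(a_i^T x - b_i)^2:
# g = grad_i(x) - grad_i(x_tilde) + full_grad(x_tilde), unbiased with shrinking variance.
rng = np.random.default_rng(0)
A, b = rng.normal(size=(50, 3)), rng.normal(size=50)
grad_i = lambda x, i: (A[i] @ x - b[i]) * A[i]       # per-sample gradient
full_grad = lambda x: A.T @ (A @ x - b) / len(b)     # full-batch gradient

x = np.zeros(3)
for _ in range(20):                  # outer loop: refresh the snapshot
    x_tilde, mu = x.copy(), full_grad(x)
    for _ in range(50):              # inner loop: cheap variance-reduced steps
        i = rng.integers(len(b))
        g = grad_i(x, i) - grad_i(x_tilde, i) + mu
        x -= 0.05 * g                # fixed stepsize, illustrative only
```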
We consider minimizing functions for which it is expensive to compute the (possibly stochastic) gradient. Such functions are prevalent in reinforcement learning, imitation …