This monograph covers some recent advances in a range of acceleration techniques frequently used in convex optimization. We first use quadratic optimization problems to …
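As a rough illustration of the starting point mentioned above (and not an excerpt from the monograph), the sketch below compares plain gradient descent with Nesterov's accelerated method on a randomly generated strongly convex quadratic; the matrix, step size, momentum parameter, and iteration count are all assumptions made for this example.

```python
import numpy as np

# Illustrative strongly convex quadratic f(x) = 0.5 * x^T A x - b^T x.
# The matrix A, vector b, and iteration counts are made up for this sketch.
rng = np.random.default_rng(0)
Q = rng.standard_normal((50, 50))
A = Q.T @ Q + 0.1 * np.eye(50)            # positive definite Hessian
b = rng.standard_normal(50)
x_star = np.linalg.solve(A, b)            # exact minimizer, for reference

eigs = np.linalg.eigvalsh(A)
L_smooth, mu = eigs.max(), eigs.min()     # smoothness / strong convexity constants

def grad(x):
    return A @ x - b

# Plain gradient descent with step size 1/L.
x_gd = np.zeros(50)
for _ in range(500):
    x_gd = x_gd - grad(x_gd) / L_smooth

# Nesterov's accelerated gradient method for strongly convex problems.
beta = (np.sqrt(L_smooth) - np.sqrt(mu)) / (np.sqrt(L_smooth) + np.sqrt(mu))
x_acc = y = np.zeros(50)
for _ in range(500):
    x_next = y - grad(y) / L_smooth       # gradient step at the extrapolated point
    y = x_next + beta * (x_next - x_acc)  # momentum extrapolation
    x_acc = x_next

print("GD error:      ", np.linalg.norm(x_gd - x_star))
print("Nesterov error:", np.linalg.norm(x_acc - x_star))
```

With the same iteration budget, the accelerated iterate typically ends up far closer to the minimizer, reflecting the square-root versus linear dependence on the condition number.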
We analyze (stochastic) gradient descent (SGD) with delayed updates on smooth quasi-convex and non-convex functions and derive concise, non-asymptotic convergence rates …
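As a minimal companion to the delayed-update setting described above (a sketch under assumed parameters, not the paper's algorithm or analysis), the loop below runs SGD on a toy least-squares problem where each stochastic gradient is computed at the current iterate but only applied tau steps later.

```python
import numpy as np
from collections import deque

# Toy least-squares objective; the problem, delay tau, and step size
# are illustrative assumptions, not the setting analyzed in the paper.
rng = np.random.default_rng(1)
X = rng.standard_normal((1000, 20))
w_true = rng.standard_normal(20)
y = X @ w_true + 0.1 * rng.standard_normal(1000)

def stochastic_grad(w, i):
    # Gradient of 0.5 * (x_i^T w - y_i)^2 on a single sample.
    return (X[i] @ w - y[i]) * X[i]

tau, eta = 5, 0.01            # delay and step size (assumed values)
w = np.zeros(20)
pending = deque()             # gradients still "in flight"

for t in range(5000):
    i = rng.integers(len(y))
    pending.append(stochastic_grad(w, i))   # computed at the current iterate
    if len(pending) > tau:
        w = w - eta * pending.popleft()     # applied tau steps later

print("distance to w_true:", np.linalg.norm(w - w_true))
```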
In this paper, we consider nonconvex minimax optimization, which is gaining prominence in many modern machine learning applications, such as GANs. Large-scale edge-based …
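To make the minimax setting concrete, the toy sketch below runs simultaneous two-timescale gradient descent-ascent on a small problem that is nonconvex in the minimization variable and strongly concave in the maximization variable; the objective, step sizes, and starting point are invented for illustration and are unrelated to the paper's edge-based setup.

```python
import numpy as np

# Toy nonconvex-strongly-concave minimax problem (illustrative only):
#     min_x max_y  0.5*cos(x) + x*y - 0.5*y**2.
# The inner maximizer is y*(x) = x, so the outer objective is
# 0.5*cos(x) + 0.5*x**2, a nonconvex function minimized at x = 0 (and y = 0).
def grad_x(x, y):
    return -0.5 * np.sin(x) + y

def grad_y(x, y):
    return x - y

x, y = 2.0, -1.0
eta_x, eta_y = 0.01, 0.1      # two-timescale steps: slower descent, faster ascent

for _ in range(10000):
    gx, gy = grad_x(x, y), grad_y(x, y)
    x -= eta_x * gx           # descent on the minimization variable
    y += eta_y * gy           # ascent on the maximization variable

print(f"approximate saddle point: x = {x:.4f}, y = {y:.4f}")
```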
Motivated by the recent increased interest in algorithms for non-convex optimization, as applied to training deep neural networks and other optimization problems …
Stochastic Gradient Descent (SGD) is being used routinely for optimizing non-convex functions. Yet, the standard convergence theory for SGD in the smooth non-convex …
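For intuition about the quantity that standard smooth non-convex guarantees control, the sketch below runs SGD on an assumed one-dimensional non-convex objective with artificial gradient noise and tracks the smallest squared gradient norm encountered, which is what typical O(1/sqrt(T))-type bounds constrain.

```python
import numpy as np

# Toy non-convex objective f(x) = x**2 + 3*sin(x)**2 with noisy gradients.
# The function, noise level, and step size are assumptions for illustration;
# standard guarantees bound min_t E||grad f(x_t)||^2, which is tracked below.
rng = np.random.default_rng(5)

def grad(x):
    return 2 * x + 6 * np.sin(x) * np.cos(x)

x, eta = 3.0, 0.01
best_sq_grad = np.inf
for t in range(2000):
    g = grad(x) + rng.normal(scale=0.5)        # stochastic gradient
    best_sq_grad = min(best_sq_grad, grad(x) ** 2)
    x -= eta * g

print("smallest squared gradient norm seen:", best_sq_grad)
```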
K Kim, K Wu, J Oh, JR Gardner - … Conference on Machine …, 2023 - proceedings.mlr.press
Understanding the gradient variance of black-box variational inference (BBVI) is a crucial step for establishing its convergence and developing algorithmic improvements. However …
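To ground the notion of gradient variance in BBVI, the sketch below draws reparameterization-gradient samples for a mean-field Gaussian approximation of a toy Gaussian target and measures their empirical variance; the target, variational family, and sample count are assumptions for this illustration, not the estimator or bounds studied in the paper.

```python
import numpy as np

# Minimal sketch of measuring the variance of a reparameterization-gradient
# estimator in BBVI.  The toy target, variational family, and sample count
# are illustrative assumptions, not the setting analyzed in the paper.
rng = np.random.default_rng(2)
d = 10
mu_p = rng.standard_normal(d)            # target: p(z) = N(mu_p, I)

m = np.zeros(d)                          # variational mean of q = N(m, sigma^2 I)
log_sigma = np.zeros(d)

def grad_m_sample():
    # One reparameterized sample z = m + sigma * eps and the resulting
    # single-sample estimate of the ELBO gradient with respect to m.
    # For this Gaussian target, grad_z log p(z) = -(z - mu_p), and the
    # entropy of q does not depend on m.
    eps = rng.standard_normal(d)
    z = m + np.exp(log_sigma) * eps
    return -(z - mu_p)

samples = np.stack([grad_m_sample() for _ in range(10000)])
print("mean gradient (first coords):     ", samples.mean(axis=0)[:3])
print("per-coordinate variance (first 3):", samples.var(axis=0)[:3])
# For this toy problem the exact per-coordinate variance is sigma^2 = 1.
```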
Modern machine learning paradigms, such as deep learning, occur in or close to the interpolation regime, wherein the number of model parameters is much larger than the …
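A tiny sketch of the interpolation regime under assumed dimensions: with far more parameters than training samples, the minimum-norm linear fit below drives the training error to (numerically) zero.

```python
import numpy as np

# Illustration of interpolation: more parameters than samples, so a linear
# model fits the training data exactly.  Dimensions and data are made up.
rng = np.random.default_rng(3)
n, p = 20, 200                            # n samples, p >> n parameters
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

w = np.linalg.pinv(X) @ y                 # minimum-norm interpolating solution
print("training error:", np.linalg.norm(X @ w - y))   # ~ 0: exact interpolation
```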
In this paper we provide an O(m · loglog^{O(1)} n · log(1/ϵ))-expected time algorithm for solving Laplacian systems on n-node m-edge graphs, improving upon the previous best expected …
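The solver referred to above is well beyond a short sketch, but as a simple point of reference the code below solves a graph Laplacian system with plain conjugate gradient on an assumed random graph; this baseline is in no way the near-linear-time algorithm from the paper.

```python
import numpy as np

# Baseline sketch: solve a graph Laplacian system L x = b with plain
# conjugate gradient.  The random graph and iteration cap are illustrative.
rng = np.random.default_rng(4)
n = 200
A = (rng.random((n, n)) < 0.1).astype(float)   # random undirected graph
A = np.triu(A, 1)
A = A + A.T
L = np.diag(A.sum(axis=1)) - A                 # graph Laplacian (PSD, singular)

b = rng.standard_normal(n)
b -= b.mean()                                  # project b orthogonal to the null space

# Conjugate gradient for the consistent singular system L x = b.
x = np.zeros(n)
r = b - L @ x
p = r.copy()
rs_old = r @ r
for _ in range(1000):
    Lp = L @ p
    alpha = rs_old / (p @ Lp)
    x += alpha * p
    r = r - alpha * Lp
    rs_new = r @ r
    if np.sqrt(rs_new) < 1e-10:
        break
    p = r + (rs_new / rs_old) * p
    rs_old = rs_new

print("residual norm:", np.linalg.norm(L @ x - b))
```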