Acceleration by Stepsize Hedging: Multi-Step Descent and the Silver Stepsize Schedule

J Altschuler, P Parrilo - Journal of the ACM, 2023 - dl.acm.org
Can we accelerate the convergence of gradient descent without changing the algorithm—
just by judiciously choosing stepsizes? Surprisingly, we show that the answer is yes. Our …

Provably faster gradient descent via long steps

B Grimmer - SIAM Journal on Optimization, 2024 - SIAM
This work establishes new convergence guarantees for gradient descent in smooth convex
optimization via a computer-assisted analysis technique. Our theory allows nonconstant …
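
To make the idea of nonconstant, periodically repeated stepsizes concrete, the following minimal Python sketch runs gradient descent while cycling through a fixed pattern of normalized stepsizes. The pattern values and the toy problem are illustrative placeholders, not the certified patterns obtained by the paper's computer-assisted analysis.

import numpy as np

def gd_periodic_pattern(grad, x0, L, pattern, n_cycles):
    """Gradient descent cycling through a fixed pattern of normalized
    stepsizes h; the actual stepsize used is h / L for an L-smooth f.
    `pattern` is a placeholder; the certified long-step patterns from
    the paper are not reproduced here.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(n_cycles):
        for h in pattern:
            x = x - (h / L) * grad(x)
    return x

# Toy usage: least squares f(x) = 0.5 * ||A x - b||^2 with L = ||A||_2^2,
# using a hypothetical pattern that mixes short and long (h > 2) steps.
A = np.array([[2.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, -1.0])
L = np.linalg.norm(A, 2) ** 2
x_hat = gd_periodic_pattern(lambda x: A.T @ (A @ x - b),
                            np.zeros(2), L, pattern=[1.5, 1.5, 3.0], n_cycles=10)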

Acceleration by stepsize hedging: Silver Stepsize Schedule for smooth convex optimization

JM Altschuler, PA Parrilo - Mathematical Programming, 2024 - Springer
We provide a concise, self-contained proof that the Silver Stepsize Schedule proposed in
our companion paper directly applies to smooth (non-strongly) convex optimization …
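
As a concrete illustration, here is a minimal Python sketch of gradient descent with the Silver Stepsize Schedule, assuming the t-th normalized stepsize is h_t = 1 + rho**(nu(t) - 1), where rho = 1 + sqrt(2) is the silver ratio and nu(t) is the 2-adic valuation of t. Indexing and edge-case conventions in the papers may differ, so treat this as a sketch rather than the authors' exact prescription.

import numpy as np

def silver_stepsize(t, rho=1 + np.sqrt(2)):
    """Normalized silver stepsize for iteration t >= 1.

    Assumes h_t = 1 + rho**(nu(t) - 1), where nu(t) is the 2-adic
    valuation of t (largest k with 2**k dividing t); this yields the
    pattern sqrt(2), 2, sqrt(2), 2 + sqrt(2), ...
    """
    nu = 0
    while t % 2 == 0:
        t //= 2
        nu += 1
    return 1 + rho ** (nu - 1)

def gd_silver(grad, x0, L, T):
    """Gradient descent with stepsizes silver_stepsize(t) / L on an L-smooth convex f."""
    x = np.asarray(x0, dtype=float)
    for t in range(1, T + 1):
        x = x - (silver_stepsize(t) / L) * grad(x)
    return x

# Toy usage: least squares f(x) = 0.5 * ||A x - b||^2, L = ||A||_2^2, T = 2**5 - 1 steps.
A = np.array([[2.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, -1.0])
L = np.linalg.norm(A, 2) ** 2
x_hat = gd_silver(lambda x: A.T @ (A @ x - b), np.zeros(2), L, T=31)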

Accelerated objective gap and gradient norm convergence for gradient descent via long steps

B Grimmer, K Shu, AL Wang - arXiv preprint arXiv:2403.14045, 2024 - arxiv.org
This work considers gradient descent for L-smooth convex optimization with stepsizes larger
than the classic regime where descent can be ensured. The stepsize schedules considered …
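
As a quick sanity check of why such stepsizes fall outside the classic descent regime, consider the quadratic f(x) = (L/2) x^2: a normalized stepsize h > 2 increases the objective on that single step, so per-step descent cannot be guaranteed and the guarantees in this line of work are instead stated over the whole schedule. A tiny sketch:

L = 1.0
f = lambda x: 0.5 * L * x ** 2
grad = lambda x: L * x

x = 1.0
h = 3.0                           # normalized stepsize larger than 2
x_next = x - (h / L) * grad(x)    # = (1 - h) * x = -2.0
assert f(x_next) > f(x)           # objective increases on this single step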

Accelerated gradient descent via long steps

B Grimmer, K Shu, AL Wang - arXiv preprint arXiv:2309.09961, 2023 - arxiv.org
Recently, Grimmer [1] showed that for smooth convex optimization, by utilizing longer steps
periodically, gradient descent's state-of-the-art O(1/T) convergence guarantees can be …


Good regularity creates large learning rate implicit biases: edge of stability, balancing, and catapult

Y Wang, Z Xu, T Zhao, M Tao - arXiv preprint arXiv:2310.17087, 2023 - arxiv.org
Large learning rates, when applied to gradient descent for nonconvex optimization, yield
various implicit biases including the edge of stability (Cohen et al., 2021), balancing (Wang …

Accelerated gradient descent by concatenation of stepsize schedules

Z Zhang, R Jiang - arXiv preprint arXiv:2410.12395, 2024 - arxiv.org
This work considers stepsize schedules for gradient descent on smooth convex objectives.
We extend the existing literature and propose a unified technique for constructing stepsizes …

Relaxed proximal point algorithm: Tight complexity bounds and acceleration without momentum

B Wang, S Ma, J Yang, D Zhou - arXiv preprint arXiv:2410.08890, 2024 - arxiv.org
In this paper, we focus on the relaxed proximal point algorithm (RPPA) for solving convex
(possibly nonsmooth) optimization problems. We conduct a comprehensive study on three …
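
For orientation, the relaxed proximal point iteration is typically written x_{k+1} = x_k + gamma_k (prox_{lambda f}(x_k) - x_k) with relaxation gamma_k in (0, 2). Below is a minimal Python sketch on f(x) = |x|, whose prox is soft-thresholding; the relaxation schedules and complexity-optimal choices studied in the paper are not reproduced here.

import numpy as np

def prox_abs(v, lam):
    """Proximal operator of lam * |x| (soft-thresholding), applied elementwise."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def relaxed_ppa(prox, x0, lam, gamma, T):
    """Relaxed proximal point algorithm:
    x_{k+1} = x_k + gamma * (prox_{lam f}(x_k) - x_k), with gamma in (0, 2).
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(T):
        x = x + gamma * (prox(x, lam) - x)
    return x

# Toy usage: minimize f(x) = ||x||_1 starting from (3.0, -1.2).
x_hat = relaxed_ppa(prox_abs, x0=np.array([3.0, -1.2]), lam=0.5, gamma=1.5, T=20)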

Anytime Acceleration of Gradient Descent

Z Zhang, JD Lee, SS Du, Y Chen - arXiv preprint arXiv:2411.17668, 2024 - arxiv.org
This work investigates stepsize-based acceleration of gradient descent with anytime
convergence guarantees. For smooth (non-strongly) convex optimization, we propose a …

Gradient descent with adaptive stepsize converges (nearly) linearly under fourth-order growth

D Davis, D Drusvyatskiy, L Jiang - arXiv preprint arXiv:2409.19791, 2024 - arxiv.org
A prevalent belief among optimization specialists is that linear convergence of gradient
descent is contingent on the function growing quadratically away from its minimizers. In this …
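
To illustrate what an adaptive stepsize can look like in this setting (not necessarily the rule analyzed in the paper), here is a Python sketch of the classical Polyak stepsize, which scales the step by the current suboptimality gap and requires knowledge of the optimal value f*; the toy objective below has fourth-order rather than quadratic growth.

import numpy as np

def gd_polyak(f, grad, x0, f_star, T):
    """Gradient descent with the classical Polyak stepsize
    eta_k = (f(x_k) - f_star) / ||grad f(x_k)||^2.

    Shown only to illustrate an adaptive stepsize rule; the specific
    rule analyzed in the paper may differ.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(T):
        g = grad(x)
        gnorm2 = float(np.dot(g, g))
        if gnorm2 == 0.0:
            break
        x = x - ((f(x) - f_star) / gnorm2) * g
    return x

# Toy usage on a function with fourth-order growth: f(x) = ||x||^4, f* = 0.
f = lambda x: float(np.dot(x, x)) ** 2
grad = lambda x: 4.0 * float(np.dot(x, x)) * x
x_hat = gd_polyak(f, grad, x0=np.array([1.0, 2.0]), f_star=0.0, T=50)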