Acceleration by Stepsize Hedging: Multi-Step Descent and the Silver Stepsize Schedule

J Altschuler, P Parrilo - Journal of the ACM, 2023 - dl.acm.org
Can we accelerate the convergence of gradient descent without changing the algorithm—
just by judiciously choosing stepsizes? Surprisingly, we show that the answer is yes. Our …

Provably faster gradient descent via long steps

B Grimmer - SIAM Journal on Optimization, 2024 - SIAM
This work establishes new convergence guarantees for gradient descent in smooth convex
optimization via a computer-assisted analysis technique. Our theory allows nonconstant …
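
To make the idea of nonconstant, periodically repeated stepsizes concrete, the following minimal Python sketch runs gradient descent while cycling through a fixed pattern of normalized stepsizes. The pattern values and the toy problem are illustrative placeholders, not the certified patterns obtained by the paper's computer-assisted analysis.

import numpy as np

def gd_periodic_pattern(grad, x0, L, pattern, n_cycles):
    """Gradient descent cycling through a fixed pattern of normalized
    stepsizes h; the actual stepsize used is h / L for an L-smooth f.
    `pattern` is a placeholder; the certified long-step patterns from
    the paper are not reproduced here.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(n_cycles):
        for h in pattern:
            x = x - (h / L) * grad(x)
    return x

# Toy usage: least squares f(x) = 0.5 * ||A x - b||^2 with L = ||A||_2^2,
# using a hypothetical pattern that mixes short and long (h > 2) steps.
A = np.array([[2.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, -1.0])
L = np.linalg.norm(A, 2) ** 2
x_hat = gd_periodic_pattern(lambda x: A.T @ (A @ x - b),
                            np.zeros(2), L, pattern=[1.5, 1.5, 3.0], n_cycles=10)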

Acceleration by stepsize hedging: Silver Stepsize Schedule for smooth convex optimization

JM Altschuler, PA Parrilo - Mathematical Programming, 2024 - Springer
We provide a concise, self-contained proof that the Silver Stepsize Schedule proposed in
our companion paper directly applies to smooth (non-strongly) convex optimization …
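
As a concrete illustration, here is a minimal Python sketch of gradient descent with the Silver Stepsize Schedule, assuming the t-th normalized stepsize is h_t = 1 + rho**(nu(t) - 1), where rho = 1 + sqrt(2) is the silver ratio and nu(t) is the 2-adic valuation of t. Indexing and edge-case conventions in the papers may differ, so treat this as a sketch rather than the authors' exact prescription.

import numpy as np

def silver_stepsize(t, rho=1 + np.sqrt(2)):
    """Normalized silver stepsize for iteration t >= 1.

    Assumes h_t = 1 + rho**(nu(t) - 1), where nu(t) is the 2-adic
    valuation of t (largest k with 2**k dividing t); this yields the
    pattern sqrt(2), 2, sqrt(2), 2 + sqrt(2), ...
    """
    nu = 0
    while t % 2 == 0:
        t //= 2
        nu += 1
    return 1 + rho ** (nu - 1)

def gd_silver(grad, x0, L, T):
    """Gradient descent with stepsizes silver_stepsize(t) / L on an L-smooth convex f."""
    x = np.asarray(x0, dtype=float)
    for t in range(1, T + 1):
        x = x - (silver_stepsize(t) / L) * grad(x)
    return x

# Toy usage: least squares f(x) = 0.5 * ||A x - b||^2, L = ||A||_2^2, T = 2**5 - 1 steps.
A = np.array([[2.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, -1.0])
L = np.linalg.norm(A, 2) ** 2
x_hat = gd_silver(lambda x: A.T @ (A @ x - b), np.zeros(2), L, T=31)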

Accelerated objective gap and gradient norm convergence for gradient descent via long steps

B Grimmer, K Shu, AL Wang - arXiv preprint arXiv:2403.14045, 2024 - arxiv.org
This work considers gradient descent for L-smooth convex optimization with stepsizes larger
than the classic regime where descent can be ensured. The stepsize schedules considered …
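
As a quick sanity check of why such stepsizes fall outside the classic descent regime, consider the quadratic f(x) = (L/2) x^2: a normalized stepsize h > 2 increases the objective on that single step, so per-step descent cannot be guaranteed and the guarantees in this line of work are instead stated over the whole schedule. A tiny sketch:

L = 1.0
f = lambda x: 0.5 * L * x ** 2
grad = lambda x: L * x

x = 1.0
h = 3.0                           # normalized stepsize larger than 2
x_next = x - (h / L) * grad(x)    # = (1 - h) * x = -2.0
assert f(x_next) > f(x)           # objective increases on this single step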

Accelerated gradient descent via long steps

B Grimmer, K Shu, AL Wang - arXiv preprint arXiv:2309.09961, 2023 - arxiv.org
Recently, Grimmer [1] showed that for smooth convex optimization, by utilizing longer steps
periodically, gradient descent's state-of-the-art O(1/T) convergence guarantees can be …


Good regularity creates large learning rate implicit biases: edge of stability, balancing, and catapult

Y Wang, Z Xu, T Zhao, M Tao - arXiv preprint arXiv:2310.17087, 2023 - arxiv.org
Large learning rates, when applied to gradient descent for nonconvex optimization, yield
various implicit biases including the edge of stability (Cohen et al., 2021), balancing (Wang …

Accelerated gradient descent by concatenation of stepsize schedules

Z Zhang, R Jiang - arXiv preprint arXiv:2410.12395, 2024 - arxiv.org
This work considers stepsize schedules for gradient descent on smooth convex objectives.
We extend the existing literature and propose a unified technique for constructing stepsizes …

Relaxed proximal point algorithm: Tight complexity bounds and acceleration without momentum

B Wang, S Ma, J Yang, D Zhou - arXiv preprint arXiv:2410.08890, 2024 - arxiv.org
In this paper, we focus on the relaxed proximal point algorithm (RPPA) for solving convex
(possibly nonsmooth) optimization problems. We conduct a comprehensive study on three …
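
For orientation, the relaxed proximal point iteration is typically written x_{k+1} = x_k + gamma_k (prox_{lambda f}(x_k) - x_k) with relaxation gamma_k in (0, 2). Below is a minimal Python sketch on f(x) = |x|, whose prox is soft-thresholding; the relaxation schedules and complexity-optimal choices studied in the paper are not reproduced here.

import numpy as np

def prox_abs(v, lam):
    """Proximal operator of lam * |x| (soft-thresholding), applied elementwise."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def relaxed_ppa(prox, x0, lam, gamma, T):
    """Relaxed proximal point algorithm:
    x_{k+1} = x_k + gamma * (prox_{lam f}(x_k) - x_k), with gamma in (0, 2).
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(T):
        x = x + gamma * (prox(x, lam) - x)
    return x

# Toy usage: minimize f(x) = ||x||_1 starting from (3.0, -1.2).
x_hat = relaxed_ppa(prox_abs, x0=np.array([3.0, -1.2]), lam=0.5, gamma=1.5, T=20)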

Anytime Acceleration of Gradient Descent

Z Zhang, JD Lee, SS Du, Y Chen - arXiv preprint arXiv:2411.17668, 2024 - arxiv.org
This work investigates stepsize-based acceleration of gradient descent with anytime
convergence guarantees. For smooth (non-strongly) convex optimization, we propose a …

Gradient descent with adaptive stepsize converges (nearly) linearly under fourth-order growth

D Davis, D Drusvyatskiy, L Jiang - arXiv preprint arXiv:2409.19791, 2024 - arxiv.org
A prevalent belief among optimization specialists is that linear convergence of gradient
descent is contingent on the function growing quadratically away from its minimizers. In this …
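
To illustrate what an adaptive stepsize can look like in this setting (not necessarily the rule analyzed in the paper), here is a Python sketch of the classical Polyak stepsize, which scales the step by the current suboptimality gap and requires knowledge of the optimal value f*; the toy objective below has fourth-order rather than quadratic growth.

import numpy as np

def gd_polyak(f, grad, x0, f_star, T):
    """Gradient descent with the classical Polyak stepsize
    eta_k = (f(x_k) - f_star) / ||grad f(x_k)||^2.

    Shown only to illustrate an adaptive stepsize rule; the specific
    rule analyzed in the paper may differ.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(T):
        g = grad(x)
        gnorm2 = float(np.dot(g, g))
        if gnorm2 == 0.0:
            break
        x = x - ((f(x) - f_star) / gnorm2) * g
    return x

# Toy usage on a function with fourth-order growth: f(x) = ||x||^4, f* = 0.
f = lambda x: float(np.dot(x, x)) ** 2
grad = lambda x: 4.0 * float(np.dot(x, x)) * x
x_hat = gd_polyak(f, grad, x0=np.array([1.0, 2.0]), f_star=0.0, T=50)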