The dilemma of PID tuning

OA Somefun, K Akingbade, F Dahunsi - Annual Reviews in Control, 2021 - Elsevier
Many automatic feedback control and learning tasks carried out on dynamical
systems still fundamentally rely on a form of proportional–integral–derivative (PID) control …
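
For reference, the textbook control law behind the tuning problem the survey discusses (standard form; the survey itself concerns choosing the gains) is

    u(t) = K_p\, e(t) + K_i \int_0^t e(\tau)\, d\tau + K_d\, \frac{d e(t)}{d t},

where e(t) is the tracking error and K_p, K_i, K_d are the proportional, integral, and derivative gains whose tuning is the titular dilemma.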

Responsive safety in reinforcement learning by PID Lagrangian methods

A Stooke, J Achiam, P Abbeel - International Conference on …, 2020 - proceedings.mlr.press
Lagrangian methods are widely used algorithms for constrained optimization problems, but
their learning dynamics exhibit oscillations and overshoot which, when applied to safe …
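
A minimal sketch of the underlying idea, assuming the usual PID-Lagrangian shape (plain dual ascent on the multiplier is pure integral control of the constraint violation; proportional and derivative terms are added to damp oscillation and overshoot). The gains and names below are illustrative, not the paper's exact algorithm:

    def pid_multiplier_update(violation, state, kp=0.1, ki=0.01, kd=0.1):
        # violation = J_c - d, positive when the cost constraint is violated.
        state["integral"] = max(0.0, state["integral"] + violation)  # classic dual ascent term
        derivative = max(0.0, violation - state["prev"])             # reacts to rising violations
        state["prev"] = violation
        lam = kp * max(0.0, violation) + ki * state["integral"] + kd * derivative
        return max(0.0, lam), state  # Lagrange multiplier kept nonnegative

    state = {"integral": 0.0, "prev": 0.0}
    lam, state = pid_multiplier_update(violation=0.5, state=state)

With kp = kd = 0 this reduces to the standard integral-only multiplier update, which is where the oscillations originate.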

Understanding the acceleration phenomenon via high-resolution differential equations

B Shi, SS Du, MI Jordan, WJ Su - Mathematical Programming, 2022 - Springer
Gradient-based optimization algorithms can be studied from the perspective of limiting
ordinary differential equations (ODEs). Motivated by the fact that existing ODEs do not …
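
For context: the low-resolution ODE of Nesterov's method (Su, Boyd, and Candès) is \ddot{X} + (3/t)\dot{X} + \nabla f(X) = 0, which other methods with different behavior also reach in the limit. The high-resolution refinement keeps O(\sqrt{s}) terms in the step size s and takes roughly the form

    \ddot{X}(t) + \frac{3}{t}\,\dot{X}(t) + \sqrt{s}\,\nabla^2 f(X(t))\,\dot{X}(t) + \Big(1 + \frac{3\sqrt{s}}{2t}\Big)\nabla f(X(t)) = 0,

whose Hessian-driven damping term \sqrt{s}\,\nabla^2 f(X)\dot{X} is what separates Nesterov's method from heavy ball at the continuous-time level.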

Analysis of optimization algorithms via integral quadratic constraints: Nonstrongly convex problems

M Fazlyab, A Ribeiro, M Morari, VM Preciado - SIAM Journal on Optimization, 2018 - SIAM
In this paper, we develop a unified framework capable of certifying both exponential and
subexponential convergence rates for a wide range of iterative first-order optimization …
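
The device underlying this framework (following Lessard, Recht, and Packard) is to model a first-order method as a linear system in feedback with the gradient,

    \xi_{k+1} = A\,\xi_k + B\,u_k, \qquad y_k = C\,\xi_k, \qquad u_k = \nabla f(y_k),

and to certify a convergence rate \rho by finding a matrix P \succeq 0 satisfying a small linear matrix inequality that couples (A, B, C) with an integral quadratic constraint encoding what is known about \nabla f (e.g., convexity and smoothness).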

On acceleration with noise-corrupted gradients

M Cohen, J Diakonikolas… - … Conference on Machine …, 2018 - proceedings.mlr.press
Accelerated algorithms have broad applications in large-scale optimization, due to their
generality and fast convergence. However, their stability in the practical setting of noise …
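
A minimal numerical sketch of the effect, assuming a one-dimensional quadratic and i.i.d. additive gradient noise (illustrative only; this is not the paper's algorithm or noise model):

    import numpy as np

    rng = np.random.default_rng(0)
    L, sigma, T = 10.0, 0.1, 500
    grad = lambda x: L * x  # f(x) = (L/2) * x**2, minimized at x = 0

    def run(beta, alpha=1.0 / L):
        x, x_prev = 1.0, 1.0
        for _ in range(T):
            g = grad(x) + sigma * rng.standard_normal()          # noise-corrupted gradient
            x, x_prev = x - alpha * g + beta * (x - x_prev), x   # heavy ball; beta=0 is GD
        return abs(x)

    print("GD:", run(beta=0.0), "momentum:", run(beta=0.9))

Momentum averages in stale noisy gradients, so the iterate hovers in a larger noise ball around the optimum than plain gradient descent, which is the stability question at issue.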

Accelerated optimization in deep learning with a proportional-integral-derivative controller

S Chen, J Liu, P Wang, C Xu, S Cai, J Chu - Nature Communications, 2024 - nature.com
High-performance optimization algorithms are essential in deep learning. However,
understanding the behavior of optimization (i.e., the learning process) remains challenging due …
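
A minimal sketch of the PID view of optimizers, assuming the common mapping in this literature: the current gradient acts as the proportional term, a decayed running sum of gradients (momentum-like) as the integral term, and the change in gradient as the derivative term. Gains and names are illustrative, not necessarily the update of Chen et al.:

    def pid_step(theta, grad, state, lr=0.01, kp=1.0, ki=3.0, kd=0.5, decay=0.9):
        state["I"] = decay * state["I"] + grad   # integral: accumulated (decayed) gradients
        d = grad - state["prev_grad"]            # derivative: reacts to gradient changes
        state["prev_grad"] = grad
        return theta - lr * (kp * grad + ki * state["I"] + kd * d), state

    state = {"I": 0.0, "prev_grad": 0.0}
    theta = 1.0
    theta, state = pid_step(theta, grad=2.0 * theta, state=state)  # f(theta) = theta**2

The derivative term anticipates gradient changes, which is the control-theoretic rationale for reduced overshoot relative to SGD with momentum.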

Deep learning theory review: An optimal control and dynamical systems perspective

GH Liu, EA Theodorou - arXiv preprint arXiv:1908.10920, 2019 - arxiv.org
Attempts from different disciplines to provide a fundamental understanding of deep learning
have advanced rapidly in recent years, yet a unified framework remains relatively limited. In …
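
The formulation behind this perspective treats the layers of a deep network as a discrete-time dynamical system and training as an optimal control problem,

    x_{t+1} = f_t(x_t, \theta_t), \quad t = 0, \dots, T-1, \qquad \min_{\{\theta_t\}} \; \Phi(x_T) + \sum_{t=0}^{T-1} \ell_t(x_t, \theta_t),

with activations x_t as the state and per-layer weights \theta_t as controls; backpropagation then corresponds to the adjoint (costate) recursion of Pontryagin-style optimality conditions.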

Characterizing the exact behaviors of temporal difference learning algorithms using Markov jump linear system theory

B Hu, U Syed - Advances in neural information processing …, 2019 - proceedings.neurips.cc
In this paper, we provide a unified analysis of temporal difference learning algorithms with
linear function approximators by exploiting their connections to Markov jump linear systems …
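
Concretely, TD(0) with linear function approximation \hat{v}(s) = \phi(s)^\top \theta updates

    \theta_{k+1} = \big(I - \alpha\,\phi(s_k)\,(\phi(s_k) - \gamma\,\phi(s_{k+1}))^\top\big)\,\theta_k + \alpha\, r_k\,\phi(s_k),

a linear recursion whose coefficient matrix and offset jump with the Markov state pair (s_k, s_{k+1}); that is exactly the Markov jump linear system structure the analysis exploits.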

Generalized momentum-based methods: A Hamiltonian perspective

J Diakonikolas, MI Jordan - SIAM Journal on Optimization, 2021 - SIAM
We take a Hamiltonian-based perspective to generalize Nesterov's accelerated gradient
descent and Polyak's heavy ball method to a broad class of momentum methods in the …
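
For orientation: heavy ball, x_{k+1} = x_k - \alpha \nabla f(x_k) + \beta (x_k - x_{k-1}), can be read as a discretization of conformal Hamiltonian dynamics with H(x, p) = f(x) + \tfrac{1}{2}\|p\|^2,

    \dot{x} = \nabla_p H = p, \qquad \dot{p} = -\nabla_x H - \gamma p = -\nabla f(x) - \gamma p,

i.e., the damped oscillator \ddot{x} + \gamma \dot{x} + \nabla f(x) = 0; the broader class in the paper arises from varying this Hamiltonian picture.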

Potential function-based framework for minimizing gradients in convex and min-max optimization

J Diakonikolas, P Wang - SIAM Journal on Optimization, 2022 - SIAM
Making the gradients small is a fundamental optimization problem that has eluded unifying
and simple convergence arguments in first-order optimization, so far primarily reserved for …
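
As a baseline instance of the potential-function style of argument for this problem: for L-smooth f, gradient descent with step 1/L satisfies f(x_{k+1}) \le f(x_k) - \tfrac{1}{2L}\|\nabla f(x_k)\|^2, and telescoping this decrease gives

    \min_{0 \le i \le k} \|\nabla f(x_i)\|^2 \;\le\; \frac{2L\,(f(x_0) - f^\star)}{k + 1},

the elementary rate that a unified potential-function framework can sharpen for convex and min-max settings.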