Toward a theoretical foundation of policy optimization for learning control policies

B Hu, K Zhang, N Li, M Mesbahi… - Annual Review of …, 2023 - annualreviews.org
Gradient-based methods have been widely used for system design and optimization in
diverse application domains. Recently, there has been a renewed interest in studying …

Global convergence of policy gradient methods for the linear quadratic regulator

M Fazel, R Ge, S Kakade… - … conference on machine …, 2018 - proceedings.mlr.press
Direct policy gradient methods for reinforcement learning and continuous control problems
are a popular approach for a variety of reasons: 1) they are easy to implement without …

Learning optimal controllers for linear systems with multiplicative noise via policy gradient

B Gravell, PM Esfahani… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
The linear quadratic regulator (LQR) problem has reemerged as an important theoretical
benchmark for reinforcement learning-based control of complex dynamical systems with …

Learning-based control: A tutorial and some recent results

ZP Jiang, T Bian, W Gao - Foundations and Trends® in …, 2020 - nowpublishers.com
This monograph presents a new framework for learning-based control synthesis of
continuous-time dynamical systems with unknown dynamics. The new design paradigm …

Convergence and sample complexity of gradient methods for the model-free linear–quadratic regulator problem

H Mohammadi, A Zare, M Soltanolkotabi… - … on Automatic Control, 2021 - ieeexplore.ieee.org
Model-free reinforcement learning attempts to find an optimal control action for an unknown
dynamical system by directly searching over the parameter space of controllers. The …

A tour of reinforcement learning: The view from continuous control

B Recht - Annual Review of Control, Robotics, and Autonomous …, 2019 - annualreviews.org
This article surveys reinforcement learning from the perspective of optimization and control,
with a focus on continuous control applications. It reviews the general formulation …

Derivative-free policy optimization for linear risk-sensitive and robust control design: Implicit regularization and sample complexity

K Zhang, X Zhang, B Hu… - Advances in neural …, 2021 - proceedings.neurips.cc
Direct policy search serves as one of the workhorses in modern reinforcement learning (RL),
and its applications in continuous control tasks have recently attracted increasing attention …

Analysis of the optimization landscape of linear quadratic Gaussian (LQG) control

Y Tang, Y Zheng, N Li - Learning for Dynamics and Control, 2021 - proceedings.mlr.press
This paper revisits the classical Linear Quadratic Gaussian (LQG) control from a modern
optimization perspective. We analyze two aspects of the optimization landscape of the LQG …

Learning convex optimization control policies

A Agrawal, S Barratt, S Boyd… - Learning for Dynamics …, 2020 - proceedings.mlr.press
Many control policies used in applications compute the input or action by solving a convex
optimization problem that depends on the current state and some parameters. Common …

[BOOK][B] Reinforcement learning for optimal feedback control

R Kamalapurkar, P Walters, J Rosenfeld, W Dixon - 2018 - Springer
Making the best possible decision according to some desired set of criteria is always difficult.
Such decisions are even more difficult when there are time constraints and can be …