On the optimization landscape of dynamic output feedback linear quadratic control

B Hu, K Zhang, N Li, M Mesbahi… - Annual Review of …, 2023 - annualreviews.org

Gradient-based methods have been widely used for system design and optimization in
diverse application domains. Recently, there has been a renewed interest in studying …

Cited by 63 · Related articles · All 6 versions

[PDF] neurips.cc

Global Convergence of Direct Policy Search for State-Feedback Robust Control: A Revisit of Nonsmooth Synthesis with Goldstein Subdifferential

X Guo, B Hu - Advances in Neural Information Processing …, 2022 - proceedings.neurips.cc

Direct policy search has been widely applied in modern reinforcement learning and
continuous control. However, the theoretical properties of direct policy search on nonsmooth …

Cited by 12 · Related articles · All 8 versions

[PDF] researchgate.net

RL-driven MPPI: Accelerating online control laws calculation with offline policy

Y Qu, H Chu, S Gao, J Guan, H Yan… - IEEE Transactions …, 2023 - ieeexplore.ieee.org

Model Predictive Path Integral (MPPI) is a recognized sampling-based approach for finite
horizon optimal control problems. However, the efficacy and computational efficiency of …

Cited by 7 · Related articles · All 2 versions

[PDF] arxiv.org

Escaping high-order saddles in policy optimization for Linear Quadratic Gaussian (LQG) control

Y Zheng, Y Sun, M Fazel, N Li - 2022 IEEE 61st Conference on …, 2022 - ieeexplore.ieee.org

First-order policy optimization has been widely used in reinforcement learning. It guarantees
to find the optimal policy for the state-feedback linear quadratic regulator (LQR). However …

Cited by 16 · Related articles · All 7 versions

[PDF] mlr.press

Controlgym: Large-scale control environments for benchmarking reinforcement learning algorithms

X Zhang, W Mao, S Mowlavi… - 6th Annual Learning …, 2024 - proceedings.mlr.press

We introduce controlgym, a library of thirty-six industrial control settings, and ten infinite-
dimensional partial differential equation (PDE)-based control problems. Integrated within the …

Cited by 3 · Related articles · All 2 versions

Sliding-Mode Control for Perturbed MIMO Systems With Time-Synchronized Convergence

W Jiang, SS Ge, Q Hu, D Li - IEEE Transactions on Cybernetics, 2023 - ieeexplore.ieee.org

This article introduces a novel approach called terminal sliding-mode control for achieving
time-synchronized convergence in multi-input–multi-output (MIMO) systems under …

Cited by 2 · Related articles · All 3 versions

[PDF] ieee.org

Connectivity of the feasible and sublevel sets of dynamic output feedback control with robustness constraints

B Hu, Y Zheng - IEEE Control Systems Letters, 2022 - ieeexplore.ieee.org

This letter considers the optimization landscape of linear dynamic output feedback control
with robustness constraints. We consider the feasible set of all the stabilizing full-order …

Cited by 15 · Related articles · All 7 versions

Policy gradient methods for designing dynamic output feedback controllers

T Sadamoto, T Hirai - European Journal of Control, 2024 - Elsevier

This paper proposes model-based and model-free policy gradient methods (PGMs) for
designing dynamic output feedback controllers for discrete-time partially observable …

Cited by 2 · Related articles

[PDF] arxiv.org

Mixed policy gradient

Y Guan, J Duan, SE Li, J Li, J Chen… - arXiv preprint arXiv …, 2021 - arxiv.org

Reinforcement learning (RL) has great potential in sequential decision-making. At present,
the mainstream RL algorithms are data-driven, relying on millions of iterations and a large …

Cited by 21 · Related articles · All 3 versions

[PDF] arxiv.org

Benign nonconvex landscapes in optimal and robust control, Part I: Global optimality

Y Zheng, C Pai, Y Tang - arXiv preprint arXiv:2312.15332, 2023 - arxiv.org

Direct policy search has achieved great empirical success in reinforcement learning. Many
recent studies have revisited its theoretical foundation for continuous control, which reveals …

Cited by 3 · Related articles · All 2 versions
