Recent progress in reinforcement learning and adaptive dynamic programming for advanced control applications

D Wang, N Gao, D Liu, J Li… - IEEE/CAA Journal of …, 2023 - ieeexplore.ieee.org
Reinforcement learning (RL) has roots in dynamic programming and it is called
adaptive/approximate dynamic programming (ADP) within the control community. This paper …

Event-Triggered Control of Nonlinear Discrete-Time System With Unknown Dynamics Based on HDP(λ)

T Li, D Yang, X Xie, H Zhang - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
The heuristic dynamic programming (HDP)(λ)-based optimal control strategy, which takes a
long-term prediction parameter into account using an iterative manner, accelerates the …

Deep deterministic policy gradient with compatible critic network

D Wang, M Hu - IEEE Transactions on Neural Networks and …, 2021 - ieeexplore.ieee.org
Deep deterministic policy gradient (DDPG) is a powerful reinforcement learning algorithm for
large-scale continuous controls. DDPG runs the back-propagation from the state-action …

Robust control of unknown observable nonlinear systems solved as a zero-sum game

MB Radac, T Lala - IEEE Access, 2020 - ieeexplore.ieee.org
An optimal robust control solution for general nonlinear systems with unknown but
observable dynamics is advanced here. The underlying Hamilton-Jacobi-Isaacs (HJI) …

Online Model-Free n-Step HDP With Stability Analysis

S Al-Dabooni, DC Wunsch - IEEE Transactions on Neural …, 2019 - ieeexplore.ieee.org
Because of a powerful temporal-difference (TD) with λ [TD(λ)] learning method, this paper
presents a novel n-step adaptive dynamic programming (ADP) architecture that combines …

An Improved N-Step Value Gradient Learning Adaptive Dynamic Programming Algorithm for Online Learning

S Al-Dabooni, DC Wunsch - IEEE Transactions on Neural …, 2019 - ieeexplore.ieee.org
In problems with complex dynamics and challenging state spaces, the dual heuristic
programming (DHP) algorithm has been shown theoretically and experimentally to perform …

Modified λ-policy iteration based adaptive dynamic programming for unknown discrete-time linear systems

H Jiang, B Zhou, GR Duan - IEEE Transactions on Neural …, 2023 - ieeexplore.ieee.org
In this article, the λ-policy iteration (λ-PI) method for the optimal control problem of discrete-time
linear systems is reconsidered and restated from a novel aspect. First, the traditional λ-PI …

Reinforcement learning control with n-step information for wastewater treatment systems

X Li, D Wang, M Zhao, J Qiao - Engineering Applications of Artificial …, 2024 - Elsevier
Wastewater treatment is important for maintaining a balanced urban ecosystem. To ensure
the success of wastewater treatment, the tracking error between the crucial variable …

Temporal difference learning with multi-step returns for intelligent optimal control of dynamic systems

P Xin, D Wang, A Liu, J Qiao - Neurocomputing, 2025 - Elsevier
In this article, the adaptive neural control algorithm based on the λ-return mechanism [ANC(λ)]
is developed under the temporal difference learning framework to address the optimal …

Online adaptive critic designs with tensor product B-splines and incremental model techniques

Y Feng, Y Zhou, HW Ho, H Dong, X Zhao - Journal of the Franklin Institute, 2024 - Elsevier
As an effective optimal control scheme in the field of reinforcement learning, adaptive
dynamic programming (ADP) has attracted extensive attention in recent decades. Neural …