Convergence of recurrent neuro-fuzzy value-gradient learning with and without an actor

S Al-Dabooni, D Wunsch - IEEE Transactions on Fuzzy Systems, 2019 - ieeexplore.ieee.org
In recent years, a gradient form of n-step temporal-difference [TD(λ)] learning has been developed into an advanced adaptive dynamic programming (ADP) algorithm called value-gradient learning [VGL(λ)]. In this paper, we improve the VGL(λ) architecture with a variant called the "single adaptive actor network [SNVGL(λ)]," which uses only a single function-approximator network (the critic) instead of the dual networks (critic and actor) of VGL(λ); SNVGL(λ) therefore has lower computational requirements than VGL(λ). Moreover, a recurrent hybrid neuro-fuzzy (RNF) network and a first-order Takagi-Sugeno RNF (TSRNF) network are derived and implemented to build the critic and actor networks. Furthermore, we develop novel theoretical convergence proofs for both VGL(λ) and SNVGL(λ) under certain conditions. A model-based mobile robot simulation is used to solve the optimal control problem for affine nonlinear discrete-time systems, and the robot is exposed to various noise levels to verify performance and validate the theoretical analysis.
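For readers unfamiliar with VGL(λ), the following is a minimal sketch of the value-gradient target recursion in the standard formulation the abstract builds on; the notation ($\tilde G$, $f$, $r$, $\gamma$) is generic and is not taken from the paper itself. The critic approximates the value gradient $\tilde G(x_t) \approx \partial V / \partial x_t$, and its training targets are formed backward along a trajectory:

$$ G'_t \;=\; \frac{D r}{D x}\bigg|_t \;+\; \gamma\, \frac{D f}{D x}\bigg|_t \Bigl( \lambda\, G'_{t+1} \;+\; (1-\lambda)\, \tilde G(x_{t+1}) \Bigr), $$

where $D/Dx$ denotes the total derivative through the policy $u_t = \pi(x_t)$, e.g. $Dr/Dx = \partial r/\partial x + (\partial \pi/\partial x)^{\top} \partial r/\partial u$, and the critic weights are adjusted to reduce $\sum_t \lVert \tilde G(x_t) - G'_t \rVert^2$. Setting $\lambda = 0$ bootstraps entirely from the critic, while $\lambda = 1$ backpropagates the exact trajectory gradient.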
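The abstract does not spell out how SNVGL(λ) dispenses with the actor. One common single-network construction for affine systems $x_{t+1} = f(x_t) + g(x_t)u_t$ with control cost $u^{\top} R u$ derives the action directly from the critic via the stationarity condition $2Ru + \gamma\, g(x)^{\top} \tilde G(x_{t+1}) = 0$. The sketch below illustrates only that generic idea under those assumptions; every name in it (f, g, R, critic_gradient) is a placeholder invented here, not the paper's code.

import numpy as np

# Placeholder affine discrete-time dynamics: x_{t+1} = f(x_t) + g(x_t) u_t
def f(x):
    return 0.9 * x                      # drift term (toy example)

def g(x):
    return np.eye(len(x))               # input-coupling matrix (toy example)

gamma = 0.9                             # discount factor
R = np.eye(2)                           # control-cost weight in r = x^T Q x + u^T R u

def critic_gradient(x):
    # Stand-in for a trained critic G~(x) ~= dV/dx; here the gradient of V(x) = x^T x.
    return 2.0 * x

def actorless_action(x, iters=50):
    # Solve dr/du + gamma * g(x)^T G~(x_{t+1}) = 0 for u. Because x_{t+1}
    # depends on u, iterate the fixed point
    #   u = -(gamma / 2) R^{-1} g(x)^T G~(f(x) + g(x) u),
    # which contracts for this toy system (iteration factor 0.9 per step).
    u = np.zeros(len(x))
    for _ in range(iters):
        x_next = f(x) + g(x) @ u
        u = -0.5 * gamma * np.linalg.solve(R, g(x).T @ critic_gradient(x_next))
    return u

print(actorless_action(np.array([1.0, -0.5])))

With an explicit action like this, only the critic network needs training, which is the computational saving the abstract attributes to SNVGL(λ).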
[PDF] example.edu/paper.pdf