P Chen, J Chen, Z Hu - Frontiers in Marine Science, 2021 - frontiersin.org
… AI technology discussed in this paper is a reinforcement learning algorithm named DDPG. An in-house program named DARwind is utilized to run the dynamics response analysis of …
… In this paper, we consider the episodic reinforcement learning setting in which the agent accesses p and r by interacting with the environment over successive episodes, i.e., the agent …
Y Wu, G Tucker, O Nachum - arXiv preprint arXiv:1911.11361, 2019 - arxiv.org
In reinforcement learning (RL) research, it is common to assume access to direct online interactions with the environment. However in many real-world applications, access to the …
J Chen, N Jiang - … Conference on Machine Learning, 2019 - proceedings.mlr.press
… We are concerned with value-function approximation in batch-mode reinforcement learning, which is related to and sometimes known as Approximate Dynamic Programming (ADP; …
M Jin, J Lavaei - IEEE Access, 2020 - ieeexplore.ieee.org
… problem of certifying stability of reinforcement learning policies when interconnected with … ; furthermore, we analyze and establish its (non)conservatism. Empirical evaluations on two …
… In Section 4 we present a practical reinforcement learning (RL) algorithm for implementing an SR system with an application to ad offers. In Section 5 we present an algorithm for safely …
… After introducing our approach and providing a finite-sample analysis, we empirically evaluate REMPS on both benchmark and realistic environments by comparing our results with …
… Next, we demonstrate how to solve for the optimal attack problem in practice, and empirically show that with the techniques from Deep Reinforcement Learning (DRL), we can find …
… Off-policy multi-step reinforcement learning algorithms consist of conservative and non… Motivated by the empirical results and the lack of theory, we carry out theoretical analyses of Peng’…