Conservative offline distributional reinforcement learning

Y Ma, D Jayaraman, O Bastani - Advances in neural …, 2021 - proceedings.neurips.cc
… In many applications of reinforcement learning, actively gathering data through interactions
… Offline (or batch) reinforcement learning (RL) avoids this problem by learning a policy solely …

Internally rewarded reinforcement learning

M Li, X Zhao, JH Lee, C Weber… - … on Machine Learning, 2023 - proceedings.mlr.press
… a class of reinforcement learning problems where the reward signals for policy learning are
… To this end, in the following subsections, we theoretically and empirically analyze Eq. (13) …

Efficient offline reinforcement learning with relaxed conservatism

L Huang, B Dong, W Zhang - … Transactions on Pattern Analysis …, 2024 - ieeexplore.ieee.org
… Abstract—Offline reinforcement learning (RL) aims at learning an optimal policy from a
static offline data set, without interacting with the environment. However, the theoretical …

An Empirical Investigation of Transfer Effects for Reinforcement Learning

JS Jwo, CS Lin, CH Lee, YC Lo - Computational Intelligence …, 2020 - Wiley Online Library
learning, we employ Q-learning as the base of the reinforcement learning algorithm, apply
the sorting … reinforcement learning will be similar when they both reach a similar training level. …

Deep reinforcement learning that matters

P Henderson, R Islam, P Bachman, J Pineau… - Proceedings of the …, 2018 - ojs.aaai.org
… In this section we analyze some of the evaluation metrics commonly used in the reinforcement
learning literature. In practice, RL algorithms are often evaluated by simply presenting …

Empirical evaluation of activation functions and kernel initializers on deep reinforcement learning

S Jang, Y Son - 2019 International Conference on Information …, 2019 - ieeexplore.ieee.org
analyze the effect of the activation function and the kernel initializer on the performance of
deep reinforcement learning … Empirical analyses are shown in Section III. Section IV concludes …

An analysis of switchback designs in reinforcement learning

Q Wen, C Shi, Y Yang, N Tang, H Zhu - arXiv preprint arXiv:2403.17285, 2024 - arxiv.org
… Our analysis accommodates a variety of policy value … difference learning estimators, and
double reinforcement learning … design strategies for policy evaluation in reinforcement learning. …

Surprise-based intrinsic motivation for deep reinforcement learning

J Achiam, S Sastry - arXiv preprint arXiv:1703.01732, 2017 - arxiv.org
… where r(s, a, s') is the original reward and r'(s, a, s') is the transformed reward, so ideally
we could solve (2) by applying any reinforcement learning algorithm with these reshaped …

Beyond dichotomies in reinforcement learning

AGE Collins, J Cockburn - Nature Reviews Neuroscience, 2020 - nature.com
Reinforcement learning (RL) is a framework of particular importance to psychology,
neuroscience and machine learning. Interactions between these fields, as promoted through the …

Using deep reinforcement learning for exploratory performance testing of software systems with multi-dimensional input spaces

T Ahmad, A Ashraf, D Truscan, A Domi, I Porres - IEEE Access, 2020 - ieeexplore.ieee.org
… In our first empirical analysis, we only changed the maximum negative reward per episode
(MaxNegR) parameter to -10, -100, -5000, and -100 000. We selected these values to …