In this paper, we introduce PILCO, a practical, data-efficient model-based policy search method. PILCO reduces model bias, one of the key problems of model-based reinforcement …
The significantly expanded and updated new edition of a widely used text on reinforcement learning, one of the most active research areas in artificial intelligence. Reinforcement …
Most learning algorithms are not invariant to the scale of the signal that is being approximated. We propose to adaptively normalize the targets used in the learning updates …
H Van Hasselt - Reinforcement Learning: State-of-the-Art, 2012 - Springer
Many traditional reinforcement-learning algorithms have been designed for problems with small finite state and action spaces. Learning in such discrete problems can be difficult …
Owing to recent advancements in artificial intelligence, especially deep learning, many data-driven decision support systems have been implemented to facilitate medical doctors in …
IS Comşa, S Zhang, ME Aydin… - … on Network and …, 2018 - ieeexplore.ieee.org
Dominated by delay-sensitive and massive data applications, radio resource management in 5G access networks is expected to satisfy very stringent delay and packet loss …
Q-learning is a popular reinforcement learning algorithm, but it can perform poorly in stochastic environments due to overestimating action values. Overestimation is due to the …
The overestimation caused by function approximation is a well-known property in Q-learning algorithms, especially in single-critic models, which leads to poor performance in practical …
M White - International Conference on Machine Learning, 2017 - proceedings.mlr.press
Reinforcement learning tasks are typically specified as Markov decision processes. This formalism has been highly successful, though specifications often couple the dynamics of …