In this work, we identify a novel set of conditions that ensure convergence with probability 1 of Q-learning with linear function approximation, by proposing a two time-scale variation …
J Fan, Z Wang, Y Xie, Z Yang - Learning for dynamics and …, 2020 - proceedings.mlr.press
Despite the great empirical success of deep reinforcement learning, its theoretical foundation is less well understood. In this work, we make the first attempt to theoretically …
SS Du, Y Luo, R Wang… - Advances in Neural …, 2019 - proceedings.neurips.cc
Q-learning with function approximation is one of the most popular methods in reinforcement learning. Though the idea of using function approximation was proposed at least 60 years …
J Fu, A Kumar, M Soh, S Levine - … Conference on Machine …, 2019 - proceedings.mlr.press
Q-learning methods are a common class of algorithms used in reinforcement learning (RL). However, their behavior with function approximation, especially with neural networks, is …
In this paper, we consider the model-free reinforcement learning problem and study the popular Q-learning algorithm with linear function approximation for finding the optimal …
Although Q-learning is one of the most successful algorithms for finding the best action-value function (and thus the optimal policy) in reinforcement learning, its implementation …
Q-learning is arguably one of the most applied representative reinforcement learning approaches and one of the off-policy strategies. Since the emergence of Q-learning, many …
S Hansen - arXiv preprint arXiv:1602.04062, 2016 - arxiv.org
We present a novel definition of the reinforcement learning state, actions and reward function that allows a deep Q-network (DQN) to learn to control an optimization …
D Lee, N He - International Conference on Machine …, 2019 - proceedings.mlr.press
The use of target networks has been a popular and key component of recent deep Q-learning algorithms for reinforcement learning, yet little is known from the theory side. In this …