A finite-time analysis of Q-learning with neural network function approximation

P Xu, Q Gu - International Conference on Machine Learning, 2020 - proceedings.mlr.press
Q-learning with neural network function approximation (neural Q-learning for short) is
among the most prevalent deep reinforcement learning algorithms. Despite its empirical …

A new convergent variant of Q-learning with linear function approximation

D Carvalho, FS Melo, P Santos - Advances in Neural …, 2020 - proceedings.neurips.cc
In this work, we identify a novel set of conditions that ensure convergence with probability 1
of Q-learning with linear function approximation, by proposing a two time-scale variation …

A theoretical analysis of deep Q-learning

J Fan, Z Wang, Y Xie, Z Yang - Learning for dynamics and …, 2020 - proceedings.mlr.press
Despite the great empirical success of deep reinforcement learning, its theoretical
foundation is less well understood. In this work, we make the first attempt to theoretically …

Provably efficient Q-learning with function approximation via distribution shift error checking oracle

SS Du, Y Luo, R Wang… - Advances in Neural …, 2019 - proceedings.neurips.cc
Q-learning with function approximation is one of the most popular methods in reinforcement
learning. Though the idea of using function approximation was proposed at least 60 years …

Diagnosing bottlenecks in deep Q-learning algorithms

J Fu, A Kumar, M Soh, S Levine - … Conference on Machine …, 2019 - proceedings.mlr.press
Q-learning methods are a common class of algorithms used in reinforcement learning (RL).
However, their behavior with function approximation, especially with neural networks, is …

Performance of Q-learning with linear function approximation: Stability and finite-time analysis

Z Chen, S Zhang, TT Doan, ST Maguluri… - arXiv preprint arXiv …, 2019 - optrl2019.github.io
In this paper, we consider the model-free reinforcement learning problem and study the
popular Q-learning algorithm with linear function approximation for finding the optimal …

Finite-time analysis for double Q-learning

H Xiong, L Zhao, Y Liang… - Advances in Neural …, 2020 - proceedings.neurips.cc
Although Q-learning is one of the most successful algorithms for finding the best action-value
function (and thus the optimal policy) in reinforcement learning, its implementation …

Q-learning algorithms: A comprehensive classification and applications

B Jang, M Kim, G Harerimana, JW Kim - IEEE Access, 2019 - ieeexplore.ieee.org
Q-learning is arguably one of the most applied representative reinforcement learning
approaches and one of the off-policy strategies. Since the emergence of Q-learning, many …

Using deep Q-learning to control optimization hyperparameters

S Hansen - arXiv preprint arXiv:1602.04062, 2016 - arxiv.org
We present a novel definition of the reinforcement learning state, actions and reward
function that allows a deep Q-network (DQN) to learn to control an optimization …

Target-based temporal-difference learning

D Lee, N He - International Conference on Machine …, 2019 - proceedings.mlr.press
The use of target networks has been a popular and key component of recent deep Q-learning
algorithms for reinforcement learning, yet little is known from the theory side. In this …