We study time-inhomogeneous episodic reinforcement learning (RL) under general function approximation and sparse rewards. We design a new algorithm, Variance-weighted …
P Hu, Y Chen, L Huang - International Conference on …, 2022 - proceedings.mlr.press
We study reinforcement learning with linear function approximation where the transition probability and reward functions are linear with respect to a feature mapping $\boldsymbol …
J Taupin, Y Jedra, A Proutiere - 2023 59th Annual Allerton …, 2023 - ieeexplore.ieee.org
We consider the problem of best policy identification in discounted Linear Markov Decision Processes in the fixed confidence setting, under both generative and forward models. We …
J Taupin, Y Jedra, A Proutiere - Sixteenth European Workshop on …, 2023 - openreview.net
We consider the problem of best policy identification in discounted Linear Markov Decision Processes in the fixed confidence setting, under both generative and forward models. We …
In this thesis, we investigate the design and statistical efficiency of learning algorithms in systems with a linear structure. This study is carried along three main domains, namely …
Double Q-learning \citep{hasselt2010double} has gained significant success in practice due to its effectiveness in overcoming the overestimation issue of Q-learning. However …