The Effect of Multi-step Methods on Overestimation in Deep Reinforcement Learning

L Meng, R Gorbet, D Kulić - 2020 25th International Conference …, 2021 - ieeexplore.ieee.org
Multi-step (also called n-step) methods in Reinforcement Learning (RL) have been shown to
be more efficient than the 1-step method due to faster propagation of the reward signal, both …

The Effect of Multi-step Methods on Overestimation in Deep Reinforcement Learning

L Meng, R Gorbet, D Kulić - 2020 25th International Conference on …, 2021 - computer.org

The Effect of Multi-step Methods on Overestimation in Deep Reinforcement Learning

L Meng, R Gorbet, D Kulić - International Conference on …, 2021 - research.monash.edu

The Effect of Multi-step Methods on Overestimation in Deep Reinforcement Learning

L Meng, R Gorbet, D Kulić - arXiv preprint arXiv:2006.12692, 2020 - arxiv.org

The Effect of Multi-step Methods on Overestimation in Deep Reinforcement Learning

L Meng, R Gorbet, D Kulić - arXiv e-prints, 2020 - ui.adsabs.harvard.edu

The Effect of Multi-step Methods on Overestimation in Deep Reinforcement Learning

L Meng, R Gorbet, D Kulić - ailb-web.ing.unimore.it

The Effect of Multi-step Methods on Overestimation in Deep Reinforcement Learning

L Meng, R Gorbet, D Kulić - researchgate.net