Deep reinforcement learning and its neuroscientific implications

M Botvinick, JX Wang, W Dabney, KJ Miller… - Neuron, 2020 - cell.com
The emergence of powerful artificial intelligence (AI) is defining new research directions in
neuroscience. To date, this research has focused largely on deep neural networks trained …

Distributional reinforcement learning in the brain

AS Lowet, Q Zheng, S Matias, J Drugowitsch… - Trends in …, 2020 - cell.com
Learning about rewards and punishments is critical for survival. Classical studies have
demonstrated an impressive correspondence between the firing of dopamine neurons in the …

A survey and critique of multiagent deep reinforcement learning

P Hernandez-Leal, B Kartal, ME Taylor - Autonomous Agents and Multi …, 2019 - Springer
Deep reinforcement learning (RL) has achieved outstanding results in recent years. This has
led to a dramatic increase in the number of applications and methods. Recent works have …

Phasic policy gradient

KW Cobbe, J Hilton, O Klimov… - … on Machine Learning, 2021 - proceedings.mlr.press
We introduce Phasic Policy Gradient (PPG), a reinforcement learning framework
which modifies traditional on-policy actor-critic methods by separating policy and value …

DeepMDP: Learning continuous latent space models for representation learning

C Gelada, S Kumar, J Buckman… - International …, 2019 - proceedings.mlr.press
Many reinforcement learning (RL) tasks provide the agent with high-dimensional
observations that can be simplified into low-dimensional continuous states. To formalize this …

Conservative offline distributional reinforcement learning

Y Ma, D Jayaraman, O Bastani - Advances in neural …, 2021 - proceedings.neurips.cc
Many reinforcement learning (RL) problems in practice are offline, learning purely from
observational data. A key challenge is how to ensure the learned policy is safe, which …

Revisiting rainbow: Promoting more insightful and inclusive deep reinforcement learning research

JSO Ceron, PS Castro - International Conference on …, 2021 - proceedings.mlr.press
Since the introduction of DQN, a vast majority of reinforcement learning research has
focused on reinforcement learning with deep neural networks as function approximators …

Distributional soft actor-critic: Off-policy reinforcement learning for addressing value estimation errors

J Duan, Y Guan, SE Li, Y Ren, Q Sun… - IEEE transactions on …, 2021 - ieeexplore.ieee.org
In reinforcement learning (RL), function approximation errors are known to easily lead to the
Q-value overestimations, thus greatly reducing policy performance. This article presents a …

Munchausen reinforcement learning

N Vieillard, O Pietquin, M Geist - Advances in Neural …, 2020 - proceedings.neurips.cc
Bootstrapping is a core mechanism in Reinforcement Learning (RL). Most algorithms, based
on temporal differences, replace the true value of a transiting state by their current estimate …

Exploit reward shifting in value-based deep-RL: Optimistic curiosity-based exploration and conservative exploitation via linear reward shaping

H Sun, L Han, R Yang, X Ma… - Advances in neural …, 2022 - proceedings.neurips.cc
In this work, we study the simple yet universally applicable case of reward shaping in value-
based Deep Reinforcement Learning (DRL). We show that reward shifting in the form of a …