Deep reinforcement learning at the edge of the statistical precipice

R Agarwal, M Schwarzer, PS Castro… - Advances in neural …, 2021 - proceedings.neurips.cc
Deep reinforcement learning (RL) algorithms are predominantly evaluated by comparing
their relative performance on a large suite of tasks. Most published results on deep RL …

Decision transformer: Reinforcement learning via sequence modeling

L Chen, K Lu, A Rajeswaran, K Lee… - Advances in neural …, 2021 - proceedings.neurips.cc
We introduce a framework that abstracts Reinforcement Learning (RL) as a sequence
modeling problem. This allows us to draw upon the simplicity and scalability of the …

When do transformers shine in RL? Decoupling memory from credit assignment

T Ni, M Ma, B Eysenbach… - Advances in Neural …, 2023 - proceedings.neurips.cc
Reinforcement learning (RL) algorithms face two distinct challenges: learning effective
representations of past and present observations, and determining how actions influence …

Recurrent model-free RL can be a strong baseline for many POMDPs

T Ni, B Eysenbach, R Salakhutdinov - arXiv preprint arXiv:2110.05038, 2021 - arxiv.org
Many problems in RL, such as meta-RL, robust RL, generalization in RL, and temporal credit
assignment, can be cast as POMDPs. In theory, simply augmenting model-free RL with …

Desiderata for normative models of synaptic plasticity

C Bredenberg, C Savin - Neural Computation, 2024 - direct.mit.edu
Normative models of synaptic plasticity use computational rationales to arrive at predictions
of behavioral and network-level adaptive phenomena. In recent years, there has been an …

A neural network model of when to retrieve and encode episodic memories

Q Lu, U Hasson, KA Norman - eLife, 2022 - elifesciences.org
Recent human behavioral and neuroimaging results suggest that people are selective in
when they encode and retrieve episodic memories. To explain these findings, we trained a …

Would I have gotten that reward? Long-term credit assignment by counterfactual contribution analysis

A Meulemans, S Schug… - Advances in Neural …, 2024 - proceedings.neurips.cc
To make reinforcement learning more sample efficient, we need better credit assignment
methods that measure an action's influence on future rewards. Building upon Hindsight …

Decision S4: Efficient sequence-based RL via state spaces layers

S Bar-David, I Zimerman, E Nachmani… - arXiv preprint arXiv …, 2023 - arxiv.org
Recently, sequence learning methods have been applied to the problem of off-policy
Reinforcement Learning, including the seminal work on Decision Transformers, which …

A survey of temporal credit assignment in deep reinforcement learning

E Pignatelli, J Ferret, M Geist, T Mesnard… - arXiv preprint arXiv …, 2023 - arxiv.org
The Credit Assignment Problem (CAP) refers to the longstanding challenge of
Reinforcement Learning (RL) agents to associate actions with their long-term …

Interpretable concept bottlenecks to align reinforcement learning agents

Q Delfosse, S Sztwiertnia, M Rothermel… - arXiv preprint arXiv …, 2024 - arxiv.org
Goal misalignment, reward sparsity and difficult credit assignment are only a few of the many
issues that make it difficult for deep reinforcement learning (RL) agents to learn optimal …