Synthetic returns for long-term credit assignment

R Agarwal, M Schwarzer, PS Castro… - Advances in neural …, 2021 - proceedings.neurips.cc

Deep reinforcement learning (RL) algorithms are predominantly evaluated by comparing
their relative performance on a large suite of tasks. Most published results on deep RL …

Cited by 735 · Related articles · All 8 versions

[PDF] neurips.cc

Decision transformer: Reinforcement learning via sequence modeling

L Chen, K Lu, A Rajeswaran, K Lee… - Advances in neural …, 2021 - proceedings.neurips.cc

We introduce a framework that abstracts Reinforcement Learning (RL) as a sequence
modeling problem. This allows us to draw upon the simplicity and scalability of the …

Cited by 1753 · Related articles · All 11 versions

[PDF] neurips.cc

When do transformers shine in RL? Decoupling memory from credit assignment

T Ni, M Ma, B Eysenbach… - Advances in Neural …, 2023 - proceedings.neurips.cc

Reinforcement learning (RL) algorithms face two distinct challenges: learning effective
representations of past and present observations, and determining how actions influence …

Cited by 35 · Related articles · All 8 versions

[PDF] arxiv.org

Recurrent model-free RL can be a strong baseline for many POMDPs

T Ni, B Eysenbach, R Salakhutdinov - arXiv preprint arXiv:2110.05038, 2021 - arxiv.org

Many problems in RL, such as meta-RL, robust RL, generalization in RL, and temporal credit
assignment, can be cast as POMDPs. In theory, simply augmenting model-free RL with …

Cited by 120 · Related articles · All 4 versions

[PDF] mit.edu

Desiderata for normative models of synaptic plasticity

C Bredenberg, C Savin - Neural Computation, 2024 - direct.mit.edu

Normative models of synaptic plasticity use computational rationales to arrive at predictions
of behavioral and network-level adaptive phenomena. In recent years, there has been an …

Cited by 7 · Related articles · All 8 versions

[PDF] elifesciences.org

A neural network model of when to retrieve and encode episodic memories

Q Lu, U Hasson, KA Norman - eLife, 2022 - elifesciences.org

Recent human behavioral and neuroimaging results suggest that people are selective in
when they encode and retrieve episodic memories. To explain these findings, we trained a …

Cited by 61 · Related articles · All 11 versions

[PDF] neurips.cc

Would I have gotten that reward? Long-term credit assignment by counterfactual contribution analysis

A Meulemans, S Schug… - Advances in Neural …, 2024 - proceedings.neurips.cc

To make reinforcement learning more sample efficient, we need better credit assignment
methods that measure an action's influence on future rewards. Building upon Hindsight …

Cited by 3 · Related articles · All 7 versions

[PDF] arxiv.org

Decision S4: Efficient sequence-based RL via state spaces layers

S Bar-David, I Zimerman, E Nachmani… - arXiv preprint arXiv …, 2023 - arxiv.org

Recently, sequence learning methods have been applied to the problem of off-policy
Reinforcement Learning, including the seminal work on Decision Transformers, which …

Cited by 26 · Related articles · All 4 versions

[PDF] arxiv.org

A survey of temporal credit assignment in deep reinforcement learning

E Pignatelli, J Ferret, M Geist, T Mesnard… - arXiv preprint arXiv …, 2023 - arxiv.org

The Credit Assignment Problem (CAP) refers to the longstanding challenge of
Reinforcement Learning (RL) agents to associate actions with their long-term …

Cited by 10 · Related articles · All 3 versions

[PDF] arxiv.org

Interpretable concept bottlenecks to align reinforcement learning agents

Q Delfosse, S Sztwiertnia, M Rothermel… - arXiv preprint arXiv …, 2024 - arxiv.org

Goal misalignment, reward sparsity and difficult credit assignment are only a few of the many
issues that make it difficult for deep reinforcement learning (RL) agents to learn optimal …

Cited by 12 · Related articles · All 3 versions
