Related articles - Academic Resource Search

Interpretable reward redistribution in reinforcement learning: a causal approach

Y Zhang, Y Du, B Huang, Z Wang… - Advances in …, 2024 - proceedings.neurips.cc

A major challenge in reinforcement learning is to determine which state-action pairs are
responsible for future rewards that are delayed. Reward redistribution serves as a solution to …

Cited by 4 · All 9 versions

[PDF] arxiv.org

A survey on causal reinforcement learning

Y Zeng, R Cai, F Sun, L Huang, Z Hao - arXiv preprint arXiv:2302.05209, 2023 - arxiv.org

While Reinforcement Learning (RL) achieves tremendous success in sequential decision-
making problems of many domains, it still faces key challenges of data inefficiency and the …

Cited by 11 · All 2 versions

[PDF] openreview.net

Exploration-driven representation learning in reinforcement learning

A Erraqabi, H Zhao, MC Machado… - ICML 2021 Workshop …, 2021 - openreview.net

Learning reward-agnostic representations is an emerging paradigm in reinforcement
learning. These representations can be leveraged for several purposes ranging from reward …

Cited by 12

[PDF] mlr.press

Distributional multivariate policy evaluation and exploration with the Bellman GAN

D Freirich, T Shimkin, R Meir… - … Conference on Machine …, 2019 - proceedings.mlr.press

The recently proposed distributional approach to reinforcement learning (DiRL) is centered
on learning the distribution of the reward-to-go, often referred to as the value distribution. In …

Cited by 16 · All 4 versions

[PDF] arxiv.org

Diffusion models for reinforcement learning: A survey

Z Zhu, H Zhao, H He, Y Zhong, S Zhang, Y Yu… - arXiv preprint arXiv …, 2023 - arxiv.org

Diffusion models have emerged as a prominent class of generative models, surpassing
previous methods regarding sample quality and training stability. Recent works have shown …

Cited by 13 · All 2 versions

[PDF] arxiv.org

Counterfactual credit assignment in model-free reinforcement learning

T Mesnard, T Weber, F Viola, S Thakoor… - arXiv preprint arXiv …, 2020 - arxiv.org

Credit assignment in reinforcement learning is the problem of measuring an action's
influence on future rewards. In particular, this requires separating skill from luck, i.e. …

Cited by 65 · All 6 versions

[PDF] arxiv.org

Learning long-term reward redistribution via randomized return decomposition

Z Ren, R Guo, Y Zhou, J Peng - arXiv preprint arXiv:2111.13485, 2021 - arxiv.org

Many practical applications of reinforcement learning require agents to learn from sparse
and delayed rewards. It challenges the ability of agents to attribute their actions to future …

Cited by 25 · All 7 versions

[PDF] neurips.cc

Distributional reward decomposition for reinforcement learning

Z Lin, L Zhao, D Yang, T Qin… - Advances in neural …, 2019 - proceedings.neurips.cc

Many reinforcement learning (RL) tasks have specific properties that can be leveraged to
modify existing RL algorithms to adapt to those tasks and further improve performance, and …

Cited by 17 · All 9 versions

[PDF] arxiv.org

Causal reinforcement learning: A survey

Z Deng, J Jiang, G Long, C Zhang - arXiv preprint arXiv:2307.01452, 2023 - arxiv.org

Reinforcement learning is an essential paradigm for solving sequential decision problems
under uncertainty. Despite many remarkable achievements in recent decades, applying …

Cited by 8 · All 5 versions

[PDF] arxiv.org

Constructing a good behavior basis for transfer using generalized policy updates

S Alver, D Precup - arXiv preprint arXiv:2112.15025, 2021 - arxiv.org

We study the problem of learning a good set of policies, so that when combined together,
they can solve a wide variety of unseen reinforcement learning tasks with no or very little …

Cited by 18 · All 3 versions
