Related articles - Academic Resource Search

Steady state analysis of episodic reinforcement learning

H Bojun - Advances in Neural Information Processing …, 2020 - proceedings.neurips.cc

Reinforcement Learning (RL) tasks generally divide into two kinds: continual learning and
episodic learning. The concept of steady state has played a foundational role in the …

Cited by 22 · Related articles · All 5 versions

[PDF] neurips.cc

A unifying view of optimism in episodic reinforcement learning

G Neu, C Pike-Burke - Advances in Neural Information …, 2020 - proceedings.neurips.cc

The principle of "optimism in the face of uncertainty" underpins many theoretically successful
reinforcement learning algorithms. In this paper we provide a general framework for …

Cited by 76 · Related articles · All 11 versions

[PDF] mlr.press

Optimism and delays in episodic reinforcement learning

B Howson, C Pike-Burke… - … Conference on Artificial …, 2023 - proceedings.mlr.press

There are many algorithms for regret minimisation in episodic reinforcement learning. This
problem is well-understood from a theoretical perspective, providing that the sequences of …

Cited by 4 · Related articles · All 3 versions

[PDF] arxiv.org

Sequence modeling of temporal credit assignment for episodic reinforcement learning

Y Liu, Y Luo, Y Zhong, X Chen, Q Liu… - arXiv preprint arXiv …, 2019 - arxiv.org

Recent advances in deep reinforcement learning algorithms have shown great potential and
success for solving many challenging real-world problems, including Go game and robotic …

Cited by 43 · Related articles · All 2 versions

[PDF] kent.ac.uk

Reward shaping in episodic reinforcement learning

M Grzes - 2017 - kar.kent.ac.uk

Recent advancements in reinforcement learning confirm that reinforcement learning
techniques can solve large scale problems leading to high quality autonomous decision …

Cited by 152 · Related articles · All 11 versions

[PDF] arxiv.org

Continuous episodic control

Z Yang, TM Moerland, M Preuss… - 2023 IEEE Conference …, 2023 - ieeexplore.ieee.org

Non-parametric episodic memory can be used to quickly latch onto high-rewarded
experience in reinforcement learning tasks. In contrast to parametric deep reinforcement …

Cited by 3 · Related articles · All 4 versions

[PDF] mlr.press

Detecting rewards deterioration in episodic reinforcement learning

I Greenberg, S Mannor - International Conference on …, 2021 - proceedings.mlr.press

In many RL applications, once training ends, it is vital to detect any deterioration in the agent
performance as soon as possible. Furthermore, it often has to be done without modifying the …

Cited by 12 · Related articles · All 4 versions

[PDF] biorxiv.org

Episodic control as meta-reinforcement learning

S Ritter, JX Wang, Z Kurth-Nelson, M Botvinick - BioRxiv, 2018 - biorxiv.org

Recent research has placed episodic reinforcement learning (RL) alongside model-free and
model-based RL on the list of processes centrally involved in human reward-based learning …

Cited by 19 · Related articles · All 5 versions

[PDF] jair.org

Towards continual reinforcement learning: A review and perspectives

K Khetarpal, M Riemer, I Rish, D Precup - Journal of Artificial Intelligence …, 2022 - jair.org

In this article, we aim to provide a literature review of different formulations and approaches
to continual reinforcement learning (RL), also known as lifelong or non-stationary RL. We …

Cited by 291 · Related articles · All 9 versions

[PDF] aaai.org

Theoretical guarantees of fictitious discount algorithms for episodic reinforcement learning and global convergence of policy gradient methods

X Guo, A Hu, J Zhang - Proceedings of the AAAI Conference on …, 2022 - ojs.aaai.org

When designing algorithms for finite-time-horizon episodic reinforcement learning problems,
a common approach is to introduce a fictitious discount factor and use stationary policies for …

Cited by 5 · Related articles · All 5 versions
