Related articles - Academic Resource Search

Reward shaping in episodic reinforcement learning

M Grzes - 2017 - kar.kent.ac.uk

Recent advancements in reinforcement learning confirm that reinforcement learning
techniques can solve large scale problems leading to high quality autonomous decision …

Cited by 144 · Related articles · All 11 versions

[PDF] aaai.org

Temporal-logic-based reward shaping for continuing reinforcement learning tasks

Y Jiang, S Bharadwaj, B Wu, R Shah, U Topcu… - Proceedings of the …, 2021 - ojs.aaai.org

In continuing tasks, average-reward reinforcement learning may be a more appropriate
problem formulation than the more common discounted reward formulation. As usual …

Cited by 51 · Related articles · All 13 versions

[PDF] liv.ac.uk

[PDF] Learning from demonstration for shaping through inverse reinforcement learning

HB Suay, T Brys, ME Taylor… - Proceedings of the 2016 …, 2016 - aamas.csc.liv.ac.uk

Model-free episodic reinforcement learning problems define the environment reward with
functions that often provide only sparse information throughout the task. Consequently …

Cited by 99 · Related articles · All 6 versions

[PDF] mlr.press

Near optimal reward-free reinforcement learning

Z Zhang, S Du, X Ji - International Conference on Machine …, 2021 - proceedings.mlr.press

We study the reward-free reinforcement learning framework, which is particularly suitable for
batch reinforcement learning and scenarios where one needs policies for multiple reward …

Cited by 26 · Related articles · All 3 versions

[PDF] arxiv.org

Learning long-term reward redistribution via randomized return decomposition

Z Ren, R Guo, Y Zhou, J Peng - arXiv preprint arXiv:2111.13485, 2021 - arxiv.org

Many practical applications of reinforcement learning require agents to learn from sparse
and delayed rewards. It challenges the ability of agents to attribute their actions to future …

Cited by 25 · Related articles · All 7 versions

[PDF] arxiv.org

Autonomous reinforcement learning: Formalism and benchmarking

A Sharma, K Xu, N Sardana, A Gupta… - arXiv preprint arXiv …, 2021 - arxiv.org

Reinforcement learning (RL) provides a naturalistic framing for learning through trial and
error, which is appealing both because of its simplicity and effectiveness and because of its …

Cited by 26 · Related articles · All 3 versions

[PDF] aaai.org

Belief reward shaping in reinforcement learning

O Marom, B Rosman - Proceedings of the AAAI conference on artificial …, 2018 - ojs.aaai.org

A key challenge in many reinforcement learning problems is delayed rewards, which can
significantly slow down learning. Although reward shaping has previously been introduced …

Cited by 80 · Related articles · All 11 versions

[PDF] neurips.cc

A unifying view of optimism in episodic reinforcement learning

G Neu, C Pike-Burke - Advances in Neural Information …, 2020 - proceedings.neurips.cc

The principle of "optimism in the face of uncertainty" underpins many theoretically successful
reinforcement learning algorithms. In this paper we provide a general framework for …

Cited by 73 · Related articles · All 11 versions

[PDF] arxiv.org

A state-distribution matching approach to non-episodic reinforcement learning

A Sharma, R Ahmad, C Finn - arXiv preprint arXiv:2205.05212, 2022 - arxiv.org

While reinforcement learning (RL) provides a framework for learning through trial and error,
translating RL algorithms into the real world has remained challenging. A major hurdle to …

Cited by 18 · Related articles · All 4 versions

[PDF] illinois.edu

[BOOK] Theory and application of reward shaping in reinforcement learning

AD Laud - 2004 - search.proquest.com

Applying conventional reinforcement to complex domains requires the use of an overly
simplified task model, or a large amount of training experience. This problem results from the …

Cited by 204 · Related articles · All 6 versions
