Reward shaping in episodic reinforcement learning

M Grzes - 2017 - kar.kent.ac.uk
Recent advancements in reinforcement learning confirm that reinforcement learning
techniques can solve large scale problems leading to high quality autonomous decision …

Temporal-logic-based reward shaping for continuing reinforcement learning tasks

Y Jiang, S Bharadwaj, B Wu, R Shah, U Topcu… - Proceedings of the …, 2021 - ojs.aaai.org
In continuing tasks, average-reward reinforcement learning may be a more appropriate
problem formulation than the more common discounted reward formulation. As usual …

Learning from demonstration for shaping through inverse reinforcement learning

HB Suay, T Brys, ME Taylor… - Proceedings of the 2016 …, 2016 - aamas.csc.liv.ac.uk
Model-free episodic reinforcement learning problems define the environment reward with
functions that often provide only sparse information throughout the task. Consequently …

Near optimal reward-free reinforcement learning

Z Zhang, S Du, X Ji - International Conference on Machine …, 2021 - proceedings.mlr.press
We study the reward-free reinforcement learning framework, which is particularly suitable for
batch reinforcement learning and scenarios where one needs policies for multiple reward …

Learning long-term reward redistribution via randomized return decomposition

Z Ren, R Guo, Y Zhou, J Peng - arXiv preprint arXiv:2111.13485, 2021 - arxiv.org
Many practical applications of reinforcement learning require agents to learn from sparse
and delayed rewards. It challenges the ability of agents to attribute their actions to future …

Autonomous reinforcement learning: Formalism and benchmarking

A Sharma, K Xu, N Sardana, A Gupta… - arXiv preprint arXiv …, 2021 - arxiv.org
Reinforcement learning (RL) provides a naturalistic framing for learning through trial and
error, which is appealing both because of its simplicity and effectiveness and because of its …

Belief reward shaping in reinforcement learning

O Marom, B Rosman - Proceedings of the AAAI conference on artificial …, 2018 - ojs.aaai.org
A key challenge in many reinforcement learning problems is delayed rewards, which can
significantly slow down learning. Although reward shaping has previously been introduced …

A unifying view of optimism in episodic reinforcement learning

G Neu, C Pike-Burke - Advances in Neural Information …, 2020 - proceedings.neurips.cc
The principle of "optimism in the face of uncertainty" underpins many theoretically successful
reinforcement learning algorithms. In this paper we provide a general framework for …

A state-distribution matching approach to non-episodic reinforcement learning

A Sharma, R Ahmad, C Finn - arXiv preprint arXiv:2205.05212, 2022 - arxiv.org
While reinforcement learning (RL) provides a framework for learning through trial and error,
translating RL algorithms into the real world has remained challenging. A major hurdle to …

Theory and application of reward shaping in reinforcement learning

AD Laud - 2004 - search.proquest.com
Applying conventional reinforcement to complex domains requires the use of an overly
simplified task model, or a large amount of training experience. This problem results from the …