Shaping Mario with Human Advice.

H Sami, J Bentahar, A Mourad, H Otrok, E Damiani - Information Sciences, 2022 - Elsevier

In this paper, we consider the problem of low-speed convergence in Reinforcement
Learning (RL). As a solution, various potential-based reward shaping techniques were …

Cited by: 24 | Related articles | All 4 versions

[PDF] Learning from demonstration for shaping through inverse reinforcement learning

HB Suay, T Brys, ME Taylor… - Proceedings of the 2016 …, 2016 - aamas.csc.liv.ac.uk

Model-free episodic reinforcement learning problems define the environment reward with
functions that often provide only sparse information throughout the task. Consequently …

Cited by: 100 | Related articles | All 6 versions

Subgoal-based reward shaping to improve efficiency in reinforcement learning

T Okudo, S Yamada - IEEE Access, 2021 - ieeexplore.ieee.org

Reinforcement learning, which acquires a policy maximizing long-term rewards, has been
actively studied. Unfortunately, this learning type is too slow and difficult to use in practical …

Cited by: 23 | Related articles | All 5 versions

Reward propagation using graph convolutional networks

M Klissarov, D Precup - Advances in Neural Information …, 2020 - proceedings.neurips.cc

Potential-based reward shaping provides an approach for designing good reward functions,
with the purpose of speeding up learning. However, automatically finding potential functions …

Cited by: 23 | Related articles | All 5 versions

Reinforcement learning with multiple experts: A bayesian model combination approach

M Gimelfarb, S Sanner, CG Lee - Advances in neural …, 2018 - proceedings.neurips.cc

Potential based reward shaping is a powerful technique for accelerating convergence of
reinforcement learning algorithms. Typically, such information includes an estimate of the …

Cited by: 29 | Related articles | All 6 versions

Learning human strategies for tuning cavity filters with continuous reinforcement learning

Z Wang, Y Ou - Applied Sciences, 2022 - mdpi.com

Learning to master human intentions and behave more humanlike is an ultimate goal for
autonomous agents. To achieve that, higher requirements for intelligence are imposed. In …

Cited by: 5 | Related articles | All 6 versions

Reward shaping using convolutional neural network

H Sami, H Otrok, J Bentahar, A Mourad, E Damiani - Information Sciences, 2023 - Elsevier

In this paper, we propose Value Iteration Network for Reward Shaping (VIN-RS), a potential-
based reward shaping mechanism using Convolutional Neural Network (CNN). The …

Cited by: 3 | Related articles | All 6 versions

Simultaneous control and human feedback in the training of a robotic agent with actor-critic reinforcement learning

KW Mathewson, PM Pilarski - arXiv preprint arXiv:1606.06979, 2016 - arxiv.org

This paper contributes a preliminary report on the advantages and disadvantages of
incorporating simultaneous human control and feedback signals in the training of a …

Cited by: 20 | Related articles | All 6 versions

Boosting Reinforcement Learning Algorithms in Continuous Robotic Reaching Tasks using Adaptive Potential Functions

Y Chen, L Schomaker, F Cruz - arXiv preprint arXiv:2402.04581, 2024 - arxiv.org

In reinforcement learning, reward shaping is an efficient way to guide the learning process of
an agent, as the reward can indicate the optimal policy of the task. The potential-based …

Integrating skills and simulation to solve complex navigation tasks in Infinite Mario

M Dann, F Zambetta… - IEEE Transactions on …, 2017 - ieeexplore.ieee.org

Aside from hand-coded bots, most of the videogame agents rely on experience in one way
or another. Some agents improve over time by adjusting to real experience, while others …

Cited by: 10 | Related articles | All 3 versions
