Theoretical and empirical analysis of reward shaping in reinforcement learning

R Zhang, R Yu, W Xia - Information Technology and Control, 2022 - itc.ktu.lt

The vehicle routing problem with time windows (VRPTW), as one of the best-known
combinatorial operations (CO) problems, is considered to be a tough issue in practice and the …

Cited by 3 Related articles All 4 versions

[PDF] ox.ac.uk

Learning potential functions and their representations for multi-task reinforcement learning

M Snel, S Whiteson - Autonomous agents and multi-agent systems, 2014 - Springer

In multi-task learning, there are roughly two approaches to discovering representations. The
first is to discover task relevant representations, i.e., those that compactly represent solutions …

Cited by 18 Related articles All 13 versions

Power Demand Reshaping Using Energy Storage for Distributed Edge Clouds

D Zheng, L Liu, G Tang, Y Wang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

The booming edge computing market that is supported by the edge cloud (EC) infrastructure
has brought huge operating costs, mainly the energy cost, to edge service providers. The …

Monte-carlo tree search for policy optimization

X Ma, K Driggs-Campbell, Z Zhang… - arXiv preprint arXiv …, 2019 - arxiv.org

Gradient-based methods are often used for policy optimization in deep reinforcement
learning, despite being vulnerable to local optima and saddle points. Although gradient-free …

Cited by 8 Related articles All 6 versions

[PDF] iospress.nl

[PDF] Off-policy shaping ensembles in reinforcement learning

A Harutyunyan, T Brys, P Vrancx, A Nowé - ECAI 2014, 2014 - ebooks.iospress.nl

Off-Policy Shaping Ensembles in Reinforcement Learning. Anna Harutyunyan and Tim Brys and
Peter Vrancx and …

Cited by 14 Related articles All 14 versions

[PDF] github.io

[PDF] Using Incomplete and Incorrect Plans to Shape Reinforcement Learning in Long-Sequence Sparse-Reward Tasks

H Müller, D Kudenko - Proc. of the Adaptive and …, 2023 - alaworkshop2023.github.io

Reinforcement learning (RL) agents naturally struggle with long-sequence sparse-reward
tasks due to the lack of reward feedback during exploration and the problem of identifying …

Cited by 1 Related articles

[PDF] whiterose.ac.uk

Multi-agent credit assignment in stochastic resource management games

P Mannion, S Devlin, J Duggan… - The Knowledge …, 2017 - cambridge.org

Multi-agent systems (MASs) are a form of distributed intelligence, where multiple
autonomous agents act in a common environment. Numerous complex, real world systems …

Cited by 7 Related articles All 10 versions

[PDF] arxiv.org

Using Contrastive Samples for Identifying and Leveraging Possible Causal Relationships in Reinforcement Learning

H Khadilkar, H Meisheri - Proceedings of the 6th Joint International …, 2023 - dl.acm.org

A significant challenge in reinforcement learning is quantifying the complex relationship
between actions and long-term rewards. The effects may manifest themselves over a long …

Cited by 1 Related articles All 3 versions

A Reinforcement Learning Model for Virtual Machines Consolidation in Cloud Data Center

Q Chou, W Fan, J Zhang - 2021 6th international conference on …, 2021 - ieeexplore.ieee.org

Energy consumption in data center is currently the main focus of many large-scale
enterprises and cloud service providers. Dynamic virtual machine (VM) consolidation …

Cited by 2 Related articles

[PDF] whiterose.ac.uk

Potential-based reward shaping for knowledge-based, multi-agent reinforcement learning

SM Devlin - 2013 - etheses.whiterose.ac.uk

Reinforcement learning is a robust artificial intelligence solution for agents required to act in
an environment, making their own decisions on how to behave. Typically an agent is …

Cited by 7 Related articles All 3 versions
