Authors
Patrick Mannion, Sam Devlin, Jim Duggan, Enda Howley
Publication date
2018/12/4
Journal
The Knowledge Engineering Review
Volume
33
Issue
e23
Publisher
Cambridge University Press
Description
The majority of multi-agent reinforcement learning (MARL) implementations aim to optimize systems with respect to a single objective, despite the fact that many real-world problems are inherently multi-objective in nature. Research into multi-objective MARL is still in its infancy, and few studies to date have dealt with the issue of credit assignment. Reward shaping has been proposed as a means to address the credit assignment problem in single-objective MARL; however, it has been shown to alter the intended goals of a domain if misused, leading to unintended behaviour. Two popular shaping methods are potential-based reward shaping and difference rewards, and both have been repeatedly shown to improve learning speed and the quality of joint policies learned by agents in single-objective MARL domains. This work discusses the theoretical implications of applying these shaping approaches to cooperative …
Total citations
[Chart: citations per year, 2018 to 2024]