Learning optimal parameterized policy for high level strategies in a game setting

R Prakash, M Vohra, L Behera - 2019 28th IEEE International Conference on Robot and Human …, 2019 - ieeexplore.ieee.org
Complex and interactive robot manipulation skills, such as playing a game of table tennis against a human opponent, pose a multifaceted challenge and a novel problem. Accurate trajectory generation in such dynamic situations, together with an appropriate controller to respond to the incoming table tennis ball from the opponent, is only a prerequisite to winning the game. Decision making is a major part of an intelligent robot, and a policy is needed to choose and execute the action that receives the highest reward. In this paper, we address the important problem of learning the higher-level optimal strategies that enable competitive behaviour with humans in such an interactive game setting. This paper presents a novel technique to learn a higher-level strategy for the game of table tennis using P-Q Learning (a mixture of Pavlovian learning and Q-learning) to learn a parameterized policy. The cooperative learning framework of a Kohonen Self-Organizing Map (KSOM) together with a replay memory is employed for faster strategy learning in this short-horizon problem. The strategy is learnt in simulation, using a simulated human opponent and an ideal robot that can accurately perform hitting motions in its workspace. We show that our method significantly improves the average received reward in comparison to other state-of-the-art methods.
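As background for the components named in the abstract, the following is a minimal sketch of generic Q-learning combined with an experience replay memory over a small discretized set of high-level strokes. It is not the paper's P-Q Learning or KSOM implementation, which the abstract does not detail; the action names, hyperparameters, and helper functions are all illustrative assumptions.

```python
import random
from collections import deque, defaultdict

# Illustrative sketch only: tabular Q-learning with a replay memory
# over a hypothetical discrete set of high-level table tennis strokes.
# The paper's actual P-Q Learning (Pavlovian + Q-learning) update and
# KSOM-based cooperative learning are not reproduced here.

ACTIONS = ["smash", "lob", "push"]      # hypothetical high-level strokes
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2   # assumed hyperparameters

Q = defaultdict(lambda: {a: 0.0 for a in ACTIONS})
replay = deque(maxlen=10_000)           # replay memory of past transitions

def select_action(state):
    """Epsilon-greedy selection over the discrete strategy set."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(Q[state], key=Q[state].get)

def store(state, action, reward, next_state, done):
    """Record one transition for later reuse."""
    replay.append((state, action, reward, next_state, done))

def learn(batch_size=32):
    """Sample stored transitions and apply the standard Q-learning update."""
    batch = random.sample(replay, min(batch_size, len(replay)))
    for s, a, r, s_next, done in batch:
        target = r if done else r + GAMMA * max(Q[s_next].values())
        Q[s][a] += ALPHA * (target - Q[s][a])
```

Replaying stored transitions lets each interaction be reused for multiple updates, which is one reason replay memories speed up learning in short-horizon problems like the one the abstract describes.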