Efficient human following using reinforcement learning

A Bayoumi, M Bennewitz - Workshop on Machine Learning in Planning and Control of Robot Motion, 2015 - cs.unm.edu
Abstract
In this paper, we present an approach that relies on machine learning techniques to follow people efficiently during robotic assistance tasks, in which the robot is mainly interested in reaching the final navigation goal of the human. People can perform unexpected actions during navigation (e.g., answering a landline phone), which can lead to inefficient trajectories to the target destination. Therefore, the following robot should infer the human’s intended navigation goal and intelligently plan its own path to reach it, instead of just following the human’s path. We propose a novel learning framework to generate such an efficient navigation strategy for the robot. In particular, we apply reinforcement learning to obtain a Q-function that gives, for each pair of robot and human positions, the best navigation action for the robot. Our approach predicts the human’s motion using a softened Markov decision process (MDP). This MDP is independent of the navigation learning framework and is learned beforehand from previously observed trajectories. We thoroughly evaluated our approach in simulation and on a real robot. As the experimental results demonstrate, our approach leads to efficient navigation behavior during the following task and can significantly reduce the path length and completion time compared to naive following strategies.
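The abstract's idea of a Q-function indexed by pairs of robot and human positions can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a hypothetical one-dimensional corridor, a single goal that is already known, plain tabular Q-learning, and a hand-written Boltzmann policy standing in for the softened MDP (which the paper learns from observed trajectories). All names and constants (`N`, `GOAL`, `BETA`, `human_step`, `train`, `greedy_action`) are invented for illustration.

```python
import math
import random

N = 6                 # corridor cells 0..5 (hypothetical toy world)
GOAL = N - 1          # the human's destination, assumed known in this sketch
ACTIONS = (-1, 0, 1)  # move left, stay, move right
BETA = 2.0            # softness of the human model (assumed value)


def clamp(x):
    """Keep a position inside the corridor."""
    return max(0, min(N - 1, x))


def human_step(pos, rng):
    """Sample the human's next cell from a Boltzmann (softened) policy."""
    nxt = [clamp(pos + a) for a in ACTIONS]
    weights = [math.exp(-BETA * abs(GOAL - p)) for p in nxt]
    return rng.choices(nxt, weights=weights, k=1)[0]


def train(episodes=5000, alpha=0.2, gamma=0.95, eps=0.2, seed=0):
    """Tabular Q-learning over joint (robot, human) position states."""
    rng = random.Random(seed)
    q = {(r, h): [0.0] * len(ACTIONS) for r in range(N) for h in range(N)}
    for _ in range(episodes):
        r, h = rng.randrange(N - 1), rng.randrange(N)  # robot never starts at the goal
        for _ in range(50):
            s = (r, h)
            if rng.random() < eps:  # epsilon-greedy exploration
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=q[s].__getitem__)
            r, h = clamp(r + ACTIONS[a]), human_step(h, rng)
            done = r == GOAL
            reward = 0.0 if done else -1.0  # each step short of the goal costs 1
            target = reward if done else reward + gamma * max(q[(r, h)])
            q[s][a] += alpha * (target - q[s][a])
            if done:
                break
    return q


def greedy_action(q, robot, human):
    """Best robot move for a given pair of robot and human positions."""
    values = q[(robot, human)]
    return ACTIONS[max(range(len(ACTIONS)), key=values.__getitem__)]
```

Calling `q = train()` and then `greedy_action(q, 0, 5)` looks up the learned move for a robot at cell 0 while the human stands at cell 5. In this toy world the human's position does not change the reward, so the sketch only shows the shape of the joint-state table, not the benefit of goal inference reported in the paper.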