Authors
Arpit Garg, Hao-Tien Lewis Chiang, Satomi Sugaya, Aleksandra Faust, Lydia Tapia
Publication date
2019/11/3
Conference paper
2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Pages
3534-3541
Publisher
IEEE
Description
Deep Reinforcement Learning (RL) has recently emerged as a solution for moving obstacle avoidance. Deep RL learns to simultaneously predict obstacle motions and corresponding avoidance actions directly from robot sensors, even for obstacles with different dynamics models. However, deep RL methods typically cannot guarantee policy convergence, i.e., cannot provide probabilistic collision avoidance guarantees. In contrast, stochastic reachability (SR), a computationally expensive formal method that employs a known obstacle dynamics model, identifies the optimal avoidance policy and provides strict convergence guarantees. The availability of the optimal solution for versions of the moving obstacle problem provides a baseline against which to compare trained deep RL policies. In this paper, we compare the expected cumulative reward and actions of these policies to SR, and find the following. 1) The state-value …
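The comparison the abstract describes rests on a concrete measurement: how far a learned policy's expected cumulative reward falls from the optimal state-values that a formal method computes from a known obstacle dynamics model. The sketch below is a minimal, hypothetical illustration of that methodology, not the paper's environment or code: on a toy 1-D moving-obstacle MDP, finite-horizon value iteration stands in for the SR baseline (which assumes known obstacle dynamics), and a hand-written "flee" policy stands in for a trained deep RL policy, whose value is estimated by Monte Carlo rollouts.

```python
import random

# Toy 1-D moving-obstacle world (hypothetical; illustration only). The robot
# occupies a cell r in {0..SIZE-1}; after each robot action in {-1, 0, +1},
# the obstacle steps one cell toward the robot with probability P_MOVE.
# Reward is -1 on collision (episode ends), 0 otherwise.
SIZE, H, GAMMA, P_MOVE = 5, 10, 0.95, 0.8
ACTIONS = (-1, 0, +1)

def outcomes(r, o, a):
    """All (prob, r', o', reward, done) transitions for action a in state (r, o)."""
    r2 = min(max(r + a, 0), SIZE - 1)
    chase = o + (1 if r2 > o else -1 if r2 < o else 0)
    result = []
    for p, o2 in ((P_MOVE, chase), (1.0 - P_MOVE, o)):
        done = (r2 == o2)
        result.append((p, r2, o2, -1.0 if done else 0.0, done))
    return result

def value_iteration():
    """Finite-horizon DP over (robot, obstacle) states -- a stand-in for the
    formal baseline, which knows the obstacle dynamics exactly."""
    V = {(r, o): 0.0 for r in range(SIZE) for o in range(SIZE)}
    for _ in range(H):
        V = {(r, o): max(
                 sum(p * (rew + (0.0 if done else GAMMA * V[(r2, o2)]))
                     for p, r2, o2, rew, done in outcomes(r, o, a))
                 for a in ACTIONS)
             for r in range(SIZE) for o in range(SIZE)}
    return V

def sample_step(r, o, a):
    """Sample one transition according to the outcome probabilities."""
    u, acc = random.random(), 0.0
    for p, r2, o2, rew, done in outcomes(r, o, a):
        acc += p
        if u <= acc:
            break
    return r2, o2, rew, done

def mc_value(policy, r0, o0, episodes=2000):
    """Monte Carlo estimate of the policy's expected cumulative reward."""
    total = 0.0
    for _ in range(episodes):
        r, o, ret, disc = r0, o0, 0.0, 1.0
        for _ in range(H):
            r, o, rew, done = sample_step(r, o, policy(r, o))
            ret += disc * rew
            disc *= GAMMA
            if done:
                break
        total += ret
    return total / episodes

# Stand-in for a trained deep RL policy: always move away from the obstacle.
flee = lambda r, o: 1 if r >= o else -1

V_opt = value_iteration()
for s in [(0, 4), (2, 3), (4, 1)]:
    print(f"state {s}: optimal V = {V_opt[s]:+.3f}, "
          f"flee-policy estimate = {mc_value(flee, *s):+.3f}")
```

The per-state gap printed at the end mirrors, in miniature, the paper's state-value comparison: where the estimate matches the optimal value the policy is near-optimal in that state, and where it falls short the policy is provably suboptimal there.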
Total citations
(Per-year citation histogram for 2019–2023; counts did not survive extraction.)
Scholar articles
A Garg, HTL Chiang, S Sugaya, A Faust, L Tapia - 2019 IEEE/RSJ International Conference on Intelligent …, 2019