DESPOT: Online POMDP planning with regularization

Authors

Adhiraj Somani, Nan Ye, David Hsu, Wee Sun Lee

Publication date

2013

Conference

Advances in neural information processing systems

Pages

1772-1780

Description

POMDPs provide a principled framework for planning under uncertainty, but are computationally intractable, due to the “curse of dimensionality” and the “curse of history”. This paper presents an online lookahead search algorithm that alleviates these difficulties by limiting the search to a set of sampled scenarios. The execution of all policies on the sampled scenarios is summarized using a Determinized Sparse Partially Observable Tree (DESPOT), which is a sparsely sampled belief tree. Our algorithm, named Regularized DESPOT (R-DESPOT), searches the DESPOT for a policy that optimally balances the size of the policy and the accuracy on its value estimate obtained through sampling. We give an output-sensitive performance bound for all policies derived from the DESPOT, and show that R-DESPOT works well if a small optimal policy exists. We also give an anytime approximation to R-DESPOT. Experiments show strong results, compared with two of the fastest online POMDP algorithms.
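The trade-off described above, between policy size and the accuracy of a value estimate obtained from sampled scenarios, can be illustrated with a toy sketch. It assumes the regularized objective takes the form "mean return over the sampled scenarios minus a penalty proportional to policy size"; the names `regularized_score`, `select_policy`, and `lam`, as well as the candidate data, are hypothetical and chosen only for illustration. This is not the paper's tree search: it only compares a fixed set of already-evaluated candidate policies.

```python
from statistics import mean


def regularized_score(scenario_returns, policy_size, lam):
    """Empirical value over the sampled scenarios minus a size penalty.

    Assumption: the penalty is linear in policy size with weight `lam`.
    """
    return mean(scenario_returns) - lam * policy_size


def select_policy(candidates, lam):
    """Return the name of the candidate with the highest regularized score.

    `candidates` maps a policy name to (per-scenario returns, policy size).
    """
    return max(
        candidates,
        key=lambda name: regularized_score(*candidates[name], lam),
    )


# Hypothetical data: a small policy with stable returns and a large policy
# with a higher but noisier empirical mean over the same four scenarios.
candidates = {
    "small": ([8.0, 9.0, 10.0, 9.0], 3),
    "large": ([12.0, 6.0, 13.0, 9.0], 40),
}

best = select_policy(candidates, lam=0.1)
```

With `lam=0.0` the selection reduces to picking the highest empirical mean, which favours the large policy here; a positive `lam` shifts the choice toward the small policy, whose estimate is less likely to be an artifact of the particular scenarios sampled.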

Total citations

Cited by 552

Citations per year: 2014: 11, 2015: 22, 2016: 29, 2017: 33, 2018: 43, 2019: 71, 2020: 86, 2021: 73, 2022: 64, 2023: 85, 2024: 32
