(PBVI) algorithm for POMDP planning. PBVI approximates an exact value iteration solution by selecting a small set of representative belief points and then tracking the value and its …
Determining the best management actions is challenging when critical information is missing. However, urgency and limited resources require that decisions be made …
MTJ Spaan, N Vlassis - Journal of artificial intelligence research, 2005 - jair.org
Partially observable Markov decision processes (POMDPs) form an attractive and principled framework for agent planning under uncertainty. Point-based approximate techniques for …
M Hauskrecht - Journal of artificial intelligence research, 2000 - jair.org
Partially observable Markov decision processes (POMDPs) provide an elegant mathematical framework for modeling complex decision and planning problems in …
T Smith, R Simmons - arXiv preprint arXiv:1207.4166, 2012 - arxiv.org
We present a novel POMDP planning algorithm called heuristic search value iteration (HSVI). HSVI is an anytime algorithm that returns a policy and a provable bound on its regret …
This paper explains how Partially Observable Markov Decision Processes (POMDPs) can provide a principled mathematical framework for modelling the inherent uncertainty in …
RA Brooks, C Breazeal, M Marjanović… - … on computation for …, 1998 - Springer
To explore issues of developmental structure, physical embodiment, integration of multiple sensory and motor systems, and social interaction, we have constructed an upper-torso …
For reinforcement learning in environments in which an agent has access to a reliable state signal, methods based on the Markov decision process (MDP) have had many successes. In …
The Partially Observable Markov Decision Process has long been recognized as a rich framework for real-world planning and control problems, especially in robotics. However …