A survey of multi-objective sequential decision-making

DM Roijers, P Vamplew, S Whiteson… - Journal of Artificial …, 2013 - jair.org

Sequential decision-making problems with multiple objectives arise naturally in practice and
pose unique challenges for research in decision-theoretic planning and learning, which has …

Cited by 751 Related articles All 21 versions

[PDF] arxiv.org

Partially observable Markov decision processes in robotics: A survey

M Lauri, D Hsu, J Pajarinen - IEEE Transactions on Robotics, 2022 - ieeexplore.ieee.org

Noisy sensing, imperfect control, and environment changes are defining characteristics of
many real-world robot tasks. The partially observable Markov decision process (POMDP) …

Cited by 86 Related articles All 7 versions

[PDF] neurips.cc

Monte-Carlo planning in large POMDPs

D Silver, J Veness - Advances in neural information …, 2010 - proceedings.neurips.cc

This paper introduces a Monte-Carlo algorithm for online planning in large POMDPs. The
algorithm combines a Monte-Carlo update of the agent's belief state with a Monte-Carlo tree …

Cited by 1451 Related articles All 15 versions

[PDF] psu.edu

A survey of point-based POMDP solvers

G Shani, J Pineau, R Kaplow - Autonomous Agents and Multi-Agent …, 2013 - Springer

The past decade has seen a significant breakthrough in research on solving partially
observable Markov decision processes (POMDPs). Where past solvers could not scale …

Cited by 753 Related articles All 12 versions

[PDF] uliege.be

[BOOK][B] Reinforcement learning and dynamic programming using function approximators

L Busoniu, R Babuska, B De Schutter, D Ernst - 2017 - taylorfrancis.com

From household appliances to applications in robotics, engineered systems involving
complex dynamics can only be as effective as the algorithms that control them. While …

Cited by 1252 Related articles All 12 versions

[PDF] jair.org

Adaptive submodularity: Theory and applications in active learning and stochastic optimization

D Golovin, A Krause - Journal of Artificial Intelligence Research, 2011 - jair.org

Many problems in artificial intelligence require adaptively making a sequence of decisions
with uncertain outcomes under partial observability. Solving such stochastic optimization …

Cited by 814 Related articles All 23 versions

[PDF] psu.edu

Autonomous driving in urban environments: approaches, lessons and challenges

M Campbell, M Egerstedt, JP How… - … Transactions of the …, 2010 - royalsocietypublishing.org

The development of autonomous vehicles for urban driving has seen rapid progress in the
past 30 years. This paper provides a summary of the current state of the art in autonomous …

Cited by 541 Related articles All 16 versions

[PDF] jair.org

Online planning algorithms for POMDPs

S Ross, J Pineau, S Paquet, B Chaib-Draa - Journal of Artificial Intelligence …, 2008 - jair.org

Partially Observable Markov Decision Processes (POMDPs) provide a rich
framework for sequential decision-making under uncertainty in stochastic domains …

Cited by 726 Related articles All 25 versions

[PDF] neurips.cc

RL for latent MDPs: Regret guarantees and a lower bound

J Kwon, Y Efroni, C Caramanis… - Advances in Neural …, 2021 - proceedings.neurips.cc

In this work, we consider the regret minimization problem for reinforcement learning in latent
Markov Decision Processes (LMDP). In an LMDP, an MDP is randomly drawn from a set of …

Cited by 73 Related articles All 7 versions

[PDF] nowpublishers.com

From bandits to Monte-Carlo tree search: The optimistic principle applied to optimization and planning

R Munos - Foundations and Trends® in Machine Learning, 2014 - nowpublishers.com

This work covers several aspects of the optimism in the face of uncertainty principle applied
to large scale optimization problems under finite numerical budget. The initial motivation for …

Cited by 298 Related articles All 18 versions
