Probabilistic planning with non-linear utility functions and worst-case guarantees

S Ermon, C Gomes, B Selman… - Proceedings of the 11th International Conference on Autonomous …, 2012 - aamas.csc.liv.ac.uk
Abstract
Markov Decision Processes are one of the most widely used frameworks to formulate probabilistic planning problems. Since planners are often risk-sensitive in high-stakes situations, non-linear utility functions are often introduced to describe their preferences among all possible outcomes. Alternatively, risk-sensitive decision makers often require their plans to satisfy certain worst-case guarantees. We show how to combine these two approaches by considering problems where we maximize the expected utility of the total reward subject to worst-case constraints. We generalize several existing results on the structure of optimal policies to the constrained case, for both finite and infinite horizon problems. We provide a dynamic programming algorithm to compute the optimal policy, and we introduce an admissible heuristic to effectively prune the search space. Finally, we use a stochastic shortest path problem on large real-world road networks to demonstrate the practical applicability of our method.
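
The problem described in the abstract, maximizing the expected utility of the total reward subject to a worst-case constraint, can be illustrated with a small finite-horizon sketch. This is not the paper's algorithm: it is a plain backward recursion over an augmented state (time step, state, accumulated reward), with no admissible heuristic or pruning. The toy MDP, the exponential utility, the horizon, and all names (`TRANSITIONS`, `HORIZON`, `WORST_CASE_BOUND`, `utility`, `solve`) are assumptions made up for illustration.

```python
import math
from functools import lru_cache

# Hypothetical toy MDP (not from the paper):
# TRANSITIONS[state][action] = [(probability, next_state, reward), ...]
TRANSITIONS = {
    "start": {
        "safe": [(1.0, "start", 1)],
        "risky": [(0.5, "start", 5), (0.5, "start", -1)],
    },
}
HORIZON = 2            # number of decision steps (assumed)
WORST_CASE_BOUND = 0   # total reward must be >= this on every possible trajectory
INFEASIBLE = -math.inf


def utility(total_reward):
    """Assumed non-linear, risk-averse utility of the total reward."""
    return -math.exp(-0.1 * total_reward)


@lru_cache(maxsize=None)
def solve(t, state, accumulated):
    """Return (best expected utility, best action) from the augmented state.

    The accumulated reward is part of the state because both the non-linear
    utility and the worst-case constraint depend on the total reward.
    """
    if t == HORIZON:
        if accumulated >= WORST_CASE_BOUND:
            return utility(accumulated), None
        return INFEASIBLE, None

    best_value, best_action = INFEASIBLE, None
    for action, outcomes in TRANSITIONS[state].items():
        value = 0.0
        for prob, next_state, reward in outcomes:
            if prob == 0:
                continue
            child_value, _ = solve(t + 1, next_state, accumulated + reward)
            if child_value == INFEASIBLE:
                # One possible outcome violates the bound: the action is ruled out.
                value = INFEASIBLE
                break
            value += prob * child_value
        if value > best_value:
            best_value, best_action = value, action
    return best_value, best_action


if __name__ == "__main__":
    print(solve(0, "start", 0))
```

In this toy model the chosen action depends on the accumulated reward: at step 1 with accumulated reward 5 the recursion picks `risky`, while with accumulated reward -1 only `safe` can still guarantee a total of at least `WORST_CASE_BOUND`, so `risky` is excluded regardless of its expected utility.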