Decision-theoretic planning: Structural assumptions and computational leverage

C Boutilier, T Dean, S Hanks - Journal of Artificial Intelligence Research, 1999 - jair.org
Planning under uncertainty is a central problem in the study of automated sequential
decision making, and has been addressed by researchers in many different fields, including …

Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning

RS Sutton, D Precup, S Singh - Artificial intelligence, 1999 - Elsevier
Learning, planning, and representing knowledge at multiple levels of temporal abstraction
are key, longstanding challenges for AI. In this paper we consider how these challenges can …

A sparse sampling algorithm for near-optimal planning in large Markov decision processes

M Kearns, Y Mansour, AY Ng - Machine learning, 2002 - Springer
A critical issue for the application of Markov decision processes (MDPs) to realistic problems
is how the complexity of planning scales with the size of the MDP. In stochastic …

Efficient solution algorithms for factored MDPs

C Guestrin, D Koller, R Parr, S Venkataraman - Journal of Artificial …, 2003 - jair.org
This paper addresses the problem of planning under uncertainty in large Markov Decision
Processes (MDPs). Factored MDPs represent a complex state space using state variables …

Stochastic dynamic programming with factored representations

C Boutilier, R Dearden, M Goldszmidt - Artificial intelligence, 2000 - Elsevier
Markov decision processes (MDPs) have proven to be popular models for decision-theoretic
planning, but standard dynamic programming algorithms for solving MDPs rely on explicit …

[BOOK] Temporal abstraction in reinforcement learning

D Precup - 2000 - search.proquest.com
Decision making usually involves choosing among different courses of action over a broad
range of time scales. For instance, a person planning a trip to a distant location makes high …

Scalable reinforcement learning of localized policies for multi-agent networked systems

G Qu, A Wierman, N Li - Learning for Dynamics and Control, 2020 - proceedings.mlr.press
We study reinforcement learning (RL) in a setting with a network of agents whose states and
actions interact in a local manner where the objective is to find localized policies such that …

Online self-reconfiguration with performance guarantee for energy-efficient large-scale cloud computing data centers

H Mi, H Wang, G Yin, Y Zhou, D Shi… - 2010 IEEE International …, 2010 - ieeexplore.ieee.org
In a typical large-scale data center, a set of applications are hosted over virtual machines
(VMs) running on a large number of physical machines (PMs). Such a virtualization …

[PDF] Efficient reinforcement learning in factored MDPs

M Kearns, D Koller - IJCAI, 1999 - Citeseer
We present a provably efficient and near-optimal algorithm for reinforcement learning in
Markov decision processes (MDPs) whose transition model can be factored as a dynamic …

[BOOK] Exploiting structure to efficiently solve large scale partially observable Markov decision processes

P Poupart - 2005 - Citeseer
Partially observable Markov decision processes (POMDPs) provide a natural and principled
framework to model a wide range of sequential decision making problems under …