[BOOK][B] Markov decision processes with applications to finance

N Bäuerle, U Rieder - 2011 - books.google.com
The theory of Markov decision processes focuses on controlled Markov chains in discrete
time. The authors establish the theory for general state and action spaces and at the same …

Approximate information state for approximate planning and reinforcement learning in partially observed systems

J Subramanian, A Sinha, R Seraj, A Mahajan - Journal of Machine …, 2022 - jmlr.org
We propose a theoretical framework for approximate planning and learning in partially
observed systems. Our framework is based on the fundamental notion of information state …

Revenue management for parallel flights with customer-choice behavior

D Zhang, WL Cooper - Operations Research, 2005 - pubsonline.informs.org
We consider the simultaneous seat-inventory control of a set of parallel flights between a
common origin and destination with dynamic customer choice among the flights. We …

Bisimulation metrics for continuous Markov decision processes

N Ferns, P Panangaden, D Precup - SIAM Journal on Computing, 2011 - SIAM
In recent years, various metrics have been developed for measuring the behavioral similarity
of states in probabilistic transition systems [J. Desharnais et al., Proceedings of …

An approximate dynamic programming algorithm for monotone value functions

DR Jiang, WB Powell - Operations Research, 2015 - pubsonline.informs.org
Many sequential decision problems can be formulated as Markov decision processes
(MDPs) where the optimal value function (or cost-to-go function) can be shown to satisfy a …

Structural properties of stochastic dynamic programs

JE Smith, KF McCardle - Operations Research, 2002 - pubsonline.informs.org
In Markov models of sequential decision processes, one is often interested in showing that
the value function is monotonic, convex, and/or supermodular in the state variables. These …

Multitime scale Markov decision processes

HS Chang, PJ Fard, SI Marcus… - IEEE Transactions on …, 2003 - ieeexplore.ieee.org
This paper proposes a simple analytical model called M time scale Markov decision process
(MMDPs) for hierarchically structured sequential decision making processes, where …

Lipschitz continuity of value functions in Markovian decision processes

K Hinderer - Mathematical Methods of Operations Research, 2005 - Springer
We present tools and guidelines for investigating Lipschitz continuity of the value functions
in MDP's, using the Hausdorff metric and the Kantorovich metric for measuring the influence …

[BOOK][B] The economics of search

B McCall, J McCall - 2007 - taylorfrancis.com
The economics of search is a prominent component of economic theory, and it has a
richness and elegance that underpins a host of practical applications. In this book Brian and …

Agent-state based policies in POMDPs: Beyond belief-state MDPs

A Sinha, A Mahajan - arXiv preprint arXiv:2409.15703, 2024 - arxiv.org
The traditional approach to POMDPs is to convert them into fully observed MDPs by
considering a belief state as an information state. However, a belief-state based approach …