- Academic resource search

[Book][B] Markov decision processes with applications to finance

N Bäuerle, U Rieder - 2011 - books.google.com

The theory of Markov decision processes focuses on controlled Markov chains in discrete
time. The authors establish the theory for general state and action spaces and at the same …

Cited by 559 · Related articles · All 11 versions

[PDF] jmlr.org

Approximate information state for approximate planning and reinforcement learning in partially observed systems

J Subramanian, A Sinha, R Seraj, A Mahajan - Journal of Machine …, 2022 - jmlr.org

We propose a theoretical framework for approximate planning and learning in partially
observed systems. Our framework is based on the fundamental notion of information state …

Cited by 100 · Related articles · All 14 versions

[PDF] danzhang.com

Revenue management for parallel flights with customer-choice behavior

D Zhang, WL Cooper - Operations Research, 2005 - pubsonline.informs.org

We consider the simultaneous seat-inventory control of a set of parallel flights between a
common origin and destination with dynamic customer choice among the flights. We …

Cited by 330 · Related articles · All 15 versions

[PDF] normferns.com

Bisimulation metrics for continuous Markov decision processes

N Ferns, P Panangaden, D Precup - SIAM Journal on Computing, 2011 - SIAM

In recent years, various metrics have been developed for measuring the behavioral similarity
of states in probabilistic transition systems [J. Desharnais et al., Proceedings of …

Cited by 170 · Related articles · All 13 versions

[PDF] arxiv.org

An approximate dynamic programming algorithm for monotone value functions

DR Jiang, WB Powell - Operations Research, 2015 - pubsonline.informs.org

Many sequential decision problems can be formulated as Markov decision processes
(MDPs) where the optimal value function (or cost-to-go function) can be shown to satisfy a …

Cited by 98 · Related articles · All 14 versions

[PDF] ucla.edu

Structural properties of stochastic dynamic programs

JE Smith, KF McCardle - Operations Research, 2002 - pubsonline.informs.org

In Markov models of sequential decision processes, one is often interested in showing that
the value function is monotonic, convex, and/or supermodular in the state variables. These …

Cited by 149 · Related articles · All 9 versions

[PDF] dtic.mil

Multitime scale Markov decision processes

HS Chang, PJ Fard, SI Marcus… - IEEE Transactions on …, 2003 - ieeexplore.ieee.org

This paper proposes a simple analytical model called M time scale Markov decision process
(MMDPs) for hierarchically structured sequential decision making processes, where …

Cited by 110 · Related articles · All 13 versions

Lipschitz continuity of value functions in Markovian decision processes

K Hinderer - Mathematical Methods of Operations Research, 2005 - Springer

We present tools and guidelines for investigating Lipschitz continuity of the value functions
in MDP's, using the Hausdorff metric and the Kantorovich metric for measuring the influence …

Cited by 87 · Related articles · All 9 versions

[Book][B] The economics of search

B McCall, J McCall - 2007 - taylorfrancis.com

The economics of search is a prominent component of economic theory, and it has a
richness and elegance that underpins a host of practical applications. In this book Brian and …

Cited by 76 · Related articles · All 8 versions

[PDF] arxiv.org

Agent-state based policies in POMDPs: Beyond belief-state MDPs

A Sinha, A Mahajan - arXiv preprint arXiv:2409.15703, 2024 - arxiv.org

The traditional approach to POMDPs is to convert them into fully observed MDPs by
considering a belief state as an information state. However, a belief-state based approach …

Cited by 2 · Related articles · All 3 versions
