Stochastic approximations for finite-state Markov chains

E Altman - 2021 - taylorfrancis.com

This book provides a unified approach for the study of constrained Markov decision
processes with a finite state space and unbounded costs. Unlike the single controller case …

Cited by 2711 · Related articles · All 14 versions

Zap Q-learning

AM Devraj, S Meyn - Advances in Neural Information …, 2017 - proceedings.neurips.cc

The Zap Q-learning algorithm introduced in this paper is an improvement of Watkins' original
algorithm and recent competitors in several respects. It is a matrix-gain algorithm designed …

Cited by 105 · Related articles · All 6 versions

Two time-scale stochastic approximation with controlled Markov noise and off-policy temporal-difference learning

P Karmakar, S Bhatnagar - Mathematics of Operations …, 2018 - pubsonline.informs.org

We present for the first time an asymptotic convergence analysis of two time-scale stochastic
approximation driven by “controlled” Markov noise. In particular, the faster and slower …

Cited by 71 · Related articles · All 10 versions

On recursive estimation for hidden Markov models

T Rydén - Stochastic Processes and their Applications, 1997 - Elsevier

Hidden Markov models (HMMs) have during the last decade become a widespread tool for
modelling sequences of dependent random variables. In this paper we consider a recursive …

Cited by 122 · Related articles · All 7 versions

Online statistical inference for nonlinear stochastic approximation with markovian data

X Li, J Liang, Z Zhang - arXiv preprint arXiv:2302.07690, 2023 - arxiv.org

We study the statistical inference of nonlinear stochastic approximation algorithms utilizing a
single trajectory of Markovian data. Our methodology has practical applications in various …

Cited by 10 · Related articles · All 2 versions

Fundamental design principles for reinforcement learning algorithms

AM Devraj, A Bušić, S Meyn - Handbook of Reinforcement Learning and …, 2021 - Springer

Along with the sharp increase in visibility of the field, the rate at which new reinforcement
learning algorithms are being proposed is at a new peak. While the surge in activity is …

Cited by 22 · Related articles · All 7 versions

Fastest convergence for Q-learning

AM Devraj, SP Meyn - arXiv preprint arXiv:1707.03770, 2017 - arxiv.org

The Zap Q-learning algorithm introduced in this paper is an improvement of Watkins' original
algorithm and recent competitors in several respects. It is a matrix-gain algorithm designed …

Cited by 48 · Related articles · All 4 versions

Fundamental limits of remote estimation of autoregressive Markov processes under communication constraints

J Chakravorty, A Mahajan - 2016 Information Theory and …, 2016 - ieeexplore.ieee.org

The fundamental limits of remote estimation of autoregressive Markov processes under
communication constraints are presented. The remote estimation system consists of a …

Cited by 45 · Related articles · All 10 versions

The algorithmic learning equations: Evolving strategies in dynamic games

A Cartea, P Chang, J Penalva… - Available at SSRN …, 2022 - papers.ssrn.com

We introduce the algorithmic learning equations, a set of ordinary differential equations
which characterizes the finite-time and asymptotic behavior of the stochastic interaction …

Cited by 11 · Related articles

Asynchronous stochastic approximation with differential inclusions

S Perkins, DS Leslie - Stochastic Systems, 2013 - pubsonline.informs.org

The asymptotic pseudo-trajectory approach to stochastic approximation of Benaïm,
Hofbauer and Sorin is extended for asynchronous stochastic approximations with a set …

Cited by 47 · Related articles · All 10 versions
