On the complexity of solving Markov decision problems

A Rudenko, L Palmieri, M Herman… - … Journal of Robotics …, 2020 - journals.sagepub.com

With growing numbers of intelligent autonomous systems in human environments, the ability
of such systems to perceive, understand, and anticipate human behavior becomes …

Cited by 811 · Related articles · All 17 versions

Reinforcement learning for demand response: A review of algorithms and modeling techniques

JR Vázquez-Canteli, Z Nagy - Applied energy, 2019 - Elsevier

Buildings account for about 40% of the global energy consumption. Renewable energy
resources are one possibility to mitigate the dependence of residential buildings on the …

Cited by 683 · Related articles · All 7 versions

[PDF] acm.org

Minimum cost flows, MDPs, and ℓ₁-regression in nearly linear time for dense instances

J Van Den Brand, YT Lee, YP Liu, T Saranurak… - Proceedings of the 53rd …, 2021 - dl.acm.org

In this paper we provide new randomized algorithms with improved runtimes for solving
linear programs with two-sided constraints. In the special case of the minimum cost flow …

Cited by 141 · Related articles · All 8 versions

[PDF] csudh.edu

An approach for service function chain routing and virtual function network instance migration in network function virtualization architectures

V Eramo, E Miucci, M Ammar… - IEEE/ACM Transactions …, 2017 - ieeexplore.ieee.org

Network function virtualization foresees the virtualization of service functions and their
execution on virtual machines. Any service is represented by a service function chain (SFC) …

Cited by 366 · Related articles · All 7 versions

[PDF] neurips.cc

Near-optimal time and sample complexities for solving Markov decision processes with a generative model

A Sidford, M Wang, X Wu, L Yang… - Advances in Neural …, 2018 - proceedings.neurips.cc

In this paper we consider the problem of computing an $\epsilon$-optimal policy of a
discounted Markov Decision Process (DMDP) provided we can only access its transition …

Cited by 237 · Related articles · All 6 versions

[PDF] ieee.org

Joint status sampling and updating for minimizing age of information in the Internet of Things

B Zhou, W Saad - IEEE Transactions on Communications, 2019 - ieeexplore.ieee.org

The effective operation of time-critical Internet of things (IoT) applications requires real-time
reporting of fresh status information of underlying physical processes. In this paper, a real …

Cited by 208 · Related articles · All 6 versions

[PDF] springer.com

On the convergence of projective-simulation–based reinforcement learning in Markov decision processes

WL Boyajian, J Clausen, LM Trenkwalder… - Quantum machine …, 2020 - Springer

In recent years, the interest in leveraging quantum effects for enhancing machine learning
tasks has significantly increased. Many algorithms speeding up supervised and …

Cited by 744 · Related articles · All 16 versions

Offloading time optimization via Markov decision process in mobile-edge computing

G Yang, L Hou, X He, D He, S Chan… - IEEE internet of things …, 2020 - ieeexplore.ieee.org

Computation offloading from a mobile device to the edge server is an emerging paradigm to
reduce completion latency of intensive computations in mobile-edge computing (MEC). In …

Cited by 105 · Related articles · All 4 versions

[PDF] academia.edu

[BOOK][B] Reinforcement learning: An introduction

RS Sutton, AG Barto - 2018 - books.google.com

The significantly expanded and updated new edition of a widely used text on reinforcement
learning, one of the most active research areas in artificial intelligence. Reinforcement …

Cited by 72174 · Related articles · All 54 versions

[PDF] jair.org

Reinforcement learning: A survey

LP Kaelbling, ML Littman, AW Moore - Journal of artificial intelligence …, 1996 - jair.org

This paper surveys the field of reinforcement learning from a computer-science perspective.
It is written to be accessible to researchers familiar with machine learning. Both the historical …

Cited by 11706 · Related articles · All 77 versions
