Reinforcement learning for long-run average cost

P Seele, C Dierksmeier, R Hofstetter… - Journal of Business …, 2021 - Springer

Firms increasingly deploy algorithmic pricing approaches to determine what to charge for
their goods and services. Algorithmic pricing can discriminate prices both dynamically over …

Cited by 161 · Related articles · All 13 versions

Reinforcement learning: A tutorial survey and recent advances

A Gosavi - INFORMS Journal on Computing, 2009 - pubsonline.informs.org

In the last few years, reinforcement learning (RL), also called adaptive (or approximate)
dynamic programming, has emerged as a powerful tool for solving complex sequential …

Cited by 436 · Related articles · All 15 versions

[BOOK][B] Algorithms for reinforcement learning

C Szepesvári - 2022 - books.google.com

Reinforcement learning is a learning paradigm concerned with learning to control a system
so as to maximize a numerical performance measure that expresses a long-term objective …

Cited by 2169 · Related articles · All 24 versions

Joint optimization of preventive maintenance and production scheduling for multi-state production systems based on reinforcement learning

H Yang, W Li, B Wang - Reliability Engineering & System Safety, 2021 - Elsevier

Preventive maintenance and production scheduling are two important and interactive
activities in production systems. In this work, the integrated optimization problem of …

Cited by 87 · Related articles · All 4 versions

[BOOK][B] Simulation-based optimization

A Gosavi - 2015 - Springer

This book is written for students and researchers in the field of industrial engineering,
computer science, operations research, management science, electrical engineering, and …

Cited by 823 · Related articles · All 11 versions

Reinforcement learning for predictive maintenance: A systematic technical review

R Siraskar, S Kumar, S Patil, A Bongale… - Artificial Intelligence …, 2023 - Springer

The manufacturing world is subject to ever-increasing cost optimization pressures.
Maintenance adds to cost and disrupts production; optimized maintenance is therefore of …

Cited by 27 · Related articles · All 2 versions

Joint computation offloading and multiuser scheduling using approximate dynamic programming in NB-IoT edge computing system

L Lei, H Xu, X Xiong, K Zheng… - IEEE Internet of Things …, 2019 - ieeexplore.ieee.org

The Internet of Things (IoT) connects a huge number of resource-constraint IoT devices to
the Internet, which generate massive amount of data that can be offloaded to the cloud for …

Cited by 124 · Related articles · All 4 versions

Learning and planning in average-reward Markov decision processes

Y Wan, A Naik, RS Sutton - International Conference on …, 2021 - proceedings.mlr.press

We introduce learning and planning algorithms for average-reward MDPs, including 1) the
first general proven-convergent off-policy model-free control algorithm without reference …

Cited by 70 · Related articles · All 9 versions

[BOOK][B] Adaptive treatment strategies in practice: planning trials and analyzing data for personalized medicine

MR Kosorok, EEM Moodie - 2015 - SIAM

The study of new medical treatments, and sequences of treatments, is inextricably linked
with statistics. Without statistical estimation and inference, we are left with case studies and …

Cited by 172 · Related articles · All 6 versions

Reinforcement learning algorithms with function approximation: Recent advances and applications

X Xu, L Zuo, Z Huang - Information sciences, 2014 - Elsevier

In recent years, the research on reinforcement learning (RL) has focused on function
approximation in learning prediction and control of Markov decision processes (MDPs). The …

Cited by 219 · Related articles · All 6 versions
