Self-improving factory simulation using continuous-time average-reward reinforcement learning

M Hutsebaut-Buysse, K Mets, S Latré - Machine Learning and Knowledge …, 2022 - mdpi.com

Reinforcement learning (RL) allows an agent to solve sequential decision-making problems
by interacting with an environment in a trial-and-error fashion. When these environments are …

Cited by 83 Related articles All 8 versions

[PDF] umass.edu

Recent advances in hierarchical reinforcement learning

AG Barto, S Mahadevan - Discrete event dynamic systems, 2003 - Springer

Reinforcement learning is bedeviled by the curse of dimensionality: the number of
parameters to be learned grows exponentially with the size of any compact encoding of a …

Cited by 1784 Related articles All 23 versions

[PDF] sciencedirect.com

Optimization of global production scheduling with deep reinforcement learning

B Waschneck, A Reichstaller, L Belzner, T Altenmüller… - Procedia Cirp, 2018 - Elsevier

Abstract Industrie 4.0 introduces decentralized, self-organizing and self-learning systems for
production control. At the same time, new machine learning algorithms are getting …

Cited by 375 Related articles All 4 versions

[PDF] sciencedirect.com

Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning

RS Sutton, D Precup, S Singh - Artificial intelligence, 1999 - Elsevier

Learning, planning, and representing knowledge at multiple levels of temporal abstraction
are key, longstanding challenges for AI. In this paper we consider how these challenges can …

Cited by 4537 Related articles All 39 versions

[PDF] jmlr.org

[PDF] Finite-Time Bounds for Fitted Value Iteration.

R Munos, C Szepesvári - Journal of Machine Learning Research, 2008 - jmlr.org

In this paper we develop a theoretical analysis of the performance of sampling-based fitted
value iteration (FVI) to solve infinite state-space, discounted-reward Markovian decision …

Cited by 642 Related articles All 22 versions

Reinforcement learning for predictive maintenance: A systematic technical review

R Siraskar, S Kumar, S Patil, A Bongale… - Artificial Intelligence …, 2023 - Springer

The manufacturing world is subject to ever-increasing cost optimization pressures.
Maintenance adds to cost and disrupts production; optimized maintenance is therefore of …

Cited by 27 Related articles All 2 versions

[PDF] researchgate.net

Deep reinforcement learning for semiconductor production scheduling

B Waschneck, A Reichstaller, L Belzner… - 2018 29th annual …, 2018 - ieeexplore.ieee.org

Despite producing tremendous success stories by identifying cat videos [1] or solving
computer as well as board games [2],[3], the adoption of deep learning in the semiconductor …

Cited by 148 Related articles All 4 versions

[BOOK] Temporal abstraction in reinforcement learning

D Precup - 2000 - search.proquest.com

Decision making usually involves choosing among different courses of action over a broad
range of time scales. For instance, a person planning a trip to a distant location makes high …

Cited by 405 Related articles All 5 versions

[PDF] aston.ac.uk

Relax: Incorporating uncertainty into the specification of self-adaptive systems

J Whittle, P Sawyer, N Bencomo… - 2009 17th IEEE …, 2009 - ieeexplore.ieee.org

Self-adaptive systems have the capability to autonomously modify their behaviour at run-
time in response to changes in their environment. Self-adaptation is particularly necessary …

Cited by 385 Related articles All 17 versions

[PDF] googleapis.com

System and method for dynamic multi-objective optimization of machine selection, integration and utilization

A Sustaeta, KH Lin, R Snyder, JC Theron… - US Patent …, 2017 - Google Patents

The invention provides control systems and methodologies for controlling a process having
computer-controlled equipment, which provide for optimized process performance according …

Cited by 338 Related articles All 4 versions
