Hierarchical reinforcement learning: A survey and open research challenges

M Hutsebaut-Buysse, K Mets, S Latré - Machine Learning and Knowledge …, 2022 - mdpi.com
Reinforcement learning (RL) allows an agent to solve sequential decision-making problems
by interacting with an environment in a trial-and-error fashion. When these environments are …

Recent advances in hierarchical reinforcement learning

AG Barto, S Mahadevan - Discrete event dynamic systems, 2003 - Springer
Reinforcement learning is bedeviled by the curse of dimensionality: the number of
parameters to be learned grows exponentially with the size of any compact encoding of a …

Optimization of global production scheduling with deep reinforcement learning

B Waschneck, A Reichstaller, L Belzner, T Altenmüller… - Procedia CIRP, 2018 - Elsevier
Industrie 4.0 introduces decentralized, self-organizing and self-learning systems for
production control. At the same time, new machine learning algorithms are getting …

Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning

RS Sutton, D Precup, S Singh - Artificial intelligence, 1999 - Elsevier
Learning, planning, and representing knowledge at multiple levels of temporal abstraction
are key, longstanding challenges for AI. In this paper we consider how these challenges can …

[PDF] Finite-Time Bounds for Fitted Value Iteration

R Munos, C Szepesvári - Journal of Machine Learning Research, 2008 - jmlr.org
In this paper we develop a theoretical analysis of the performance of sampling-based fitted
value iteration (FVI) to solve infinite state-space, discounted-reward Markovian decision …

Reinforcement learning for predictive maintenance: A systematic technical review

R Siraskar, S Kumar, S Patil, A Bongale… - Artificial Intelligence …, 2023 - Springer
The manufacturing world is subject to ever-increasing cost optimization pressures.
Maintenance adds to cost and disrupts production; optimized maintenance is therefore of …

Deep reinforcement learning for semiconductor production scheduling

B Waschneck, A Reichstaller, L Belzner… - 2018 29th annual …, 2018 - ieeexplore.ieee.org
Despite producing tremendous success stories by identifying cat videos [1] or solving
computer as well as board games [2],[3], the adoption of deep learning in the semiconductor …

[BOOK] Temporal abstraction in reinforcement learning

D Precup - 2000 - search.proquest.com
Decision making usually involves choosing among different courses of action over a broad
range of time scales. For instance, a person planning a trip to a distant location makes high …

Relax: Incorporating uncertainty into the specification of self-adaptive systems

J Whittle, P Sawyer, N Bencomo… - 2009 17th IEEE …, 2009 - ieeexplore.ieee.org
Self-adaptive systems have the capability to autonomously modify their behaviour at
run-time in response to changes in their environment. Self-adaptation is particularly necessary …

System and method for dynamic multi-objective optimization of machine selection, integration and utilization

A Sustaeta, KH Lin, R Snyder, JC Theron… - US Patent …, 2017 - Google Patents
The invention provides control systems and methodologies for controlling a process having
computer-controlled equipment, which provide for optimized process performance according …