Combining reinforcement learning and constraint programming for combinatorial optimization

Q Cappart, T Moisan, LM Rousseau… - Proceedings of the …, 2021 - ojs.aaai.org
… Based on a complete search procedure, it will always find the optimal solution if we allow
an execution time large enough. A critical design choice, that makes CP non-trivial to use in …

Reinforcement learning with convex constraints

S Miryoosefi, K Brantley, H Daume III… - Advances in neural …, 2019 - proceedings.neurips.cc
… In standard reinforcement learning (RL), a learning agent … are more naturally expressed
as constraints. For instance, the … of constraints in RL tasks: specifically, any constraints that …

Anytime-Constrained Reinforcement Learning

J McMahan, X Zhu - International Conference on Artificial …, 2024 - proceedings.mlr.press
… Mowbray et al., 2022) yield policies that arbitrarily violate an anytime constraint. This …
blowup for anytime constraints is in the time horizon, we focus on varying the time horizon. …

Constrained reinforcement learning has zero duality gap

S Paternain, L Chamon… - Advances in Neural …, 2019 - proceedings.neurips.cc
… as completing tasks using the least amount of time/energy, learning multiple tasks, or dealing
with multiple opponents. In the context of reinforcement learning (RL), these problems are …

Cautious reinforcement learning with logical constraints

M Hasanbeig, A Abate, D Kroening - arXiv preprint arXiv:2002.12156, 2020 - arxiv.org
… forces Reinforcement Learning (RL) to synthesise optimal control policies while ensuring
safety during the learning … Enforcing the RL agent to stay safe during learning might limit the …

Safe reinforcement learning in constrained Markov decision processes

A Wachi, Y Sui - … Conference on Machine Learning, 2020 - proceedings.mlr.press
… Safe reinforcement learning … constraints. Specifically, we take a stepwise approach for
optimizing safety and cumulative reward. In our method, the agent first learns safety constraints by …

Challenges of real-world reinforcement learning

G Dulac-Arnold, D Mankowitz, T Hester - arXiv preprint arXiv:1904.12901, 2019 - arxiv.org
… of the policy through time we index them as πi to indicate the learning iteration. An existing
constraints on the environment in this context. Constrained MDPs define a constrained …

Deep reinforcement learning driven inspection and maintenance planning under incomplete information and constraints

CP Andriotis, KG Papakonstantinou - Reliability Engineering & System …, 2021 - Elsevier
… time frame, which is able to aptly map states and times to … deterministic constraints that need
to be satisfied over multiple time … the joint context of constrained Partially Observable Markov …

Reinforcement learning–overview of recent progress and implications for process control

J Shin, TA Badgwell, KH Liu, JH Lee - Computers & Chemical Engineering, 2019 - Elsevier
… and time constraints are not important. In contrast, MPC seems best applied to continuous
tasks with complex, time… , and for which state and time constraints are very important. However, …

CRPO: A new approach for safe reinforcement learning with convergence guarantee

T Xu, Y Liang, G Lan - … Conference on Machine Learning, 2021 - proceedings.mlr.press
… is the one that maximizes the reward and at the same time satisfies the cost constraints. …
our algorithm that involves stochastic selection of a constraint if multiple constraints are violated. …