A review of safe reinforcement learning: Methods, theory and applications

S Gu, L Yang, Y Du, G Chen, F Walter, J Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Reinforcement learning (RL) has achieved tremendous success in many complex decision
making tasks. When it comes to deploying RL in the real world, safety concerns are usually …

Safe learning in robotics: From learning-based control to safe reinforcement learning

L Brunke, M Greeff, AW Hall, Z Yuan… - Annual Review of …, 2022 - annualreviews.org
The last half decade has seen a steep rise in the number of contributions on safe learning
methods for real-world robotic deployments from both the control and reinforcement learning …

End-to-end safe reinforcement learning through barrier functions for safety-critical continuous control tasks

R Cheng, G Orosz, RM Murray, JW Burdick - Proceedings of the AAAI …, 2019 - aaai.org
Reinforcement Learning (RL) algorithms have found limited success beyond simulated
applications, and one main reason is the absence of safety guarantees during the learning …

Challenges of real-world reinforcement learning: definitions, benchmarks and analysis

G Dulac-Arnold, N Levine, DJ Mankowitz, J Li… - Machine Learning, 2021 - Springer
Reinforcement learning (RL) has proven its worth in a series of artificial domains, and is
beginning to show some successes in real-world scenarios. However, much of the research …

Safe reinforcement learning in constrained Markov decision processes

A Wachi, Y Sui - International Conference on Machine …, 2020 - proceedings.mlr.press
Safe reinforcement learning has been a promising approach for optimizing the policy of an
agent that operates in safety-critical applications. In this paper, we propose an algorithm …

Policy learning with constraints in model-free reinforcement learning: A survey

Y Liu, A Halev, X Liu - The 30th international joint conference on artificial …, 2021 - par.nsf.gov
Reinforcement Learning (RL) algorithms have had tremendous success in simulated
domains. These algorithms, however, often cannot be directly applied to physical systems …

Provably efficient safe exploration via primal-dual policy optimization

D Ding, X Wei, Z Yang, Z Wang… - … conference on artificial …, 2021 - proceedings.mlr.press
We study the safe reinforcement learning problem using the constrained Markov decision
processes in which an agent aims to maximize the expected total reward subject to a safety …

Exploration-exploitation in constrained MDPs

Y Efroni, S Mannor, M Pirotta - arXiv preprint arXiv:2003.02189, 2020 - arxiv.org
In many sequential decision-making problems, the goal is to optimize a utility function while
satisfying a set of constraints on different utilities. This learning problem is formalized …

Enforcing hard constraints with soft barriers: Safe reinforcement learning in unknown stochastic environments

Y Wang, SS Zhan, R Jiao, Z Wang… - International …, 2023 - proceedings.mlr.press
It is quite challenging to ensure the safety of reinforcement learning (RL) agents in an
unknown and stochastic environment under hard constraints that require the system state …

Learning policies with zero or bounded constraint violation for constrained MDPs

T Liu, R Zhou, D Kalathil, P Kumar… - Advances in Neural …, 2021 - proceedings.neurips.cc
We address the issue of safety in reinforcement learning. We pose the problem in an
episodic framework of a constrained Markov decision process. Existing results have shown …