A review of safe reinforcement learning: Methods, theory and applications

S Gu, L Yang, Y Du, G Chen, F Walter, J Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Reinforcement Learning (RL) has achieved tremendous success in many complex decision-
making tasks. However, safety concerns are raised during deploying RL in real-world …

Last-iterate convergent policy gradient primal-dual methods for constrained MDPs

D Ding, CY Wei, K Zhang… - Advances in Neural …, 2024 - proceedings.neurips.cc
We study the problem of computing an optimal policy of an infinite-horizon discounted
constrained Markov decision process (constrained MDP). Despite the popularity of …

Probabilistic constraint for safety-critical reinforcement learning

W Chen, D Subramanian… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
In this paper, we consider the problem of learning safe policies for probabilistic-constrained
reinforcement learning (RL). Specifically, a safe policy or controller is one that, with high …

Safe Pontryagin differentiable programming

W Jin, S Mou, GJ Pappas - Advances in Neural Information …, 2021 - proceedings.neurips.cc
We propose a Safe Pontryagin Differentiable Programming (Safe PDP)
methodology, which establishes a theoretical and algorithmic framework to solve a broad …

A Review of Safe Reinforcement Learning: Methods, Theories and Applications

S Gu, L Yang, Y Du, G Chen, F Walter… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Reinforcement Learning (RL) has achieved tremendous success in many complex decision-
making tasks. However, safety concerns are raised during deploying RL in real-world …

State-augmented learnable algorithms for resource management in wireless networks

N NaderiAlizadeh, M Eisen… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
We consider resource management problems in multi-user wireless networks, which can be
cast as optimizing a network-wide utility function, subject to constraints on the long-term …

Resilient Constrained Reinforcement Learning

D Ding, Z Huan, A Ribeiro - International Conference on …, 2024 - proceedings.mlr.press
We study a class of constrained reinforcement learning (RL) problems in which multiple
constraint specifications are not identified before training. It is challenging to identify …

Deterministic Policy Gradient Primal-Dual Methods for Continuous-Space Constrained MDPs

S Rozada, D Ding, AG Marques, A Ribeiro - arXiv preprint arXiv …, 2024 - arxiv.org
We study the problem of computing deterministic optimal policies for constrained Markov
decision processes (MDPs) with continuous state and action spaces, which are widely …

Towards Cooperative Driving among Heterogeneous CAVs: A Safe Multi-Agent Reinforcement Learning Approach

Y Pan, J Lei, P Yi, L Guo, H Chen - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
With the advancement of Intelligent Transportation Systems and Vehicle-to-Everything
communication technologies, the future traffic scenario is anticipated to be a mixed …

SCPO: Safe Reinforcement Learning with Safety Critic Policy Optimization

J Mhamed, S Gu - arXiv preprint arXiv:2311.00880, 2023 - arxiv.org
Incorporating safety is an essential prerequisite for broadening the practical applications of
reinforcement learning in real-world scenarios. To tackle this challenge, Constrained Markov …