Constrained Markov decision processes via backward value functions

S Gu, L Yang, Y Du, G Chen, F Walter, J Wang… - arXiv preprint arXiv …, 2022 - arxiv.org

Reinforcement Learning (RL) has achieved tremendous success in many complex
decision-making tasks. However, safety concerns are raised during deploying RL in real-world …

Cited by 235 · Related articles · All 2 versions

[PDF] arxiv.org

Safe learning in robotics: From learning-based control to safe reinforcement learning

L Brunke, M Greeff, AW Hall, Z Yuan… - Annual Review of …, 2022 - annualreviews.org

The last half decade has seen a steep rise in the number of contributions on safe learning
methods for real-world robotic deployments from both the control and reinforcement learning …

Cited by 568 · Related articles · All 9 versions

[PDF] springer.com

Challenges of real-world reinforcement learning: definitions, benchmarks and analysis

G Dulac-Arnold, N Levine, DJ Mankowitz, J Li… - Machine Learning, 2021 - Springer

Reinforcement learning (RL) has proven its worth in a series of artificial domains, and is
beginning to show some successes in real-world scenarios. However, much of the research …

Cited by 451 · Related articles · All 6 versions

[PDF] nsf.gov

[PDF] Policy learning with constraints in model-free reinforcement learning: A survey

Y Liu, A Halev, X Liu - The 30th international joint conference on artificial …, 2021 - par.nsf.gov

Reinforcement Learning (RL) algorithms have had tremendous success in simulated
domains. These algorithms, however, often cannot be directly applied to physical systems …

Cited by 117 · Related articles · All 6 versions

[PDF] neurips.cc

Constrained update projection approach to safe policy optimization

L Yang, J Ji, J Dai, L Zhang, B Zhou… - Advances in …, 2022 - proceedings.neurips.cc

Safe reinforcement learning (RL) studies problems where an intelligent agent has to not only
maximize reward but also avoid exploring unsafe areas. In this study, we propose CUP, a …

Cited by 36 · Related articles · All 9 versions

[PDF] arxiv.org

An empirical investigation of the challenges of real-world reinforcement learning

G Dulac-Arnold, N Levine, DJ Mankowitz, J Li… - arXiv preprint arXiv …, 2020 - arxiv.org

Reinforcement learning (RL) has proven its worth in a series of artificial domains, and is
beginning to show some successes in real-world scenarios. However, much of the research …

Cited by 132 · Related articles · All 3 versions

[PDF] aaai.org

Constraints penalized Q-learning for safe offline reinforcement learning

H Xu, X Zhan, X Zhu - Proceedings of the AAAI Conference on Artificial …, 2022 - ojs.aaai.org

We study the problem of safe offline reinforcement learning (RL), the goal is to learn a policy
that maximizes long-term reward while satisfying safety constraints given only offline data …

Cited by 67 · Related articles · All 8 versions

[PDF] arxiv.org

Convergence and sample complexity of natural policy gradient primal-dual methods for constrained MDPs

D Ding, K Zhang, J Duan, T Başar… - arXiv preprint arXiv …, 2022 - arxiv.org

We study sequential decision making problems aimed at maximizing the expected total
reward while satisfying a constraint on the expected total utility. We employ the natural policy …

Cited by 26 · Related articles · All 2 versions

[PDF] aaai.org

Learning with safety constraints: Sample complexity of reinforcement learning for constrained MDPs

A HasanzadeZonuzy, A Bura, D Kalathil… - Proceedings of the …, 2021 - ojs.aaai.org

Many physical systems have underlying safety considerations that require that the policy
employed ensures the satisfaction of a set of constraints. The analytical formulation usually …

Cited by 47 · Related articles · All 7 versions

[PDF] arxiv.org

Constrained model-free reinforcement learning for process optimization

E Pan, P Petsagkourakis, M Mowbray, D Zhang… - Computers & Chemical …, 2021 - Elsevier

Reinforcement learning (RL) is a control approach that can handle nonlinear stochastic
optimal control problems. However, despite the promise exhibited, RL has yet to see marked …

Cited by 42 · Related articles · All 6 versions
