Related articles - Academic resource search

Reduced policy optimization for continuous control with hard constraints

S Ding, J Wang, Y Du, Y Shi - Advances in Neural …, 2024 - proceedings.neurips.cc

Recent advances in constrained reinforcement learning (RL) have endowed reinforcement
learning with certain safety guarantees. However, deploying existing constrained RL …

Cited by 1 · Related articles · All 8 versions

[PDF] ifaamas.org

[PDF] Risk-Aware Constrained Reinforcement Learning with Non-Stationary Policies

Z Yang, H Jin, Y Tang, G Fan - … of the 23rd International Conference on …, 2024 - ifaamas.org

Constrained reinforcement learning (RL) algorithms have attracted extensive attention
nowadays to tackle sequential decision-making problems that contain constraints defined …

Cited by 1 · Related articles · All 2 versions

[PDF] arxiv.org

Constrained Reinforcement Learning with Smoothed Log Barrier Function

B Zhang, Y Zhang, L Frison, T Brox… - arXiv preprint arXiv …, 2024 - arxiv.org

Reinforcement Learning (RL) has been widely applied to many control tasks and
substantially improved the performances compared to conventional control methods in many …

Cited by 1 · Related articles · All 3 versions

[PDF] arxiv.org

Feasible policy iteration

Y Yang, Z Zheng, SE Li, J Duan, J Liu, X Zhan… - arXiv preprint arXiv …, 2023 - arxiv.org

Safe reinforcement learning (RL) aims to find the optimal policy and its feasible region in a
constrained optimal control problem (OCP). Ensuring feasibility and optimality …

Cited by 3 · Related articles · All 3 versions

[PDF] mlr.press

Escaping from zero gradient: Revisiting action-constrained reinforcement learning via Frank-Wolfe policy optimization

JL Lin, W Hung, SH Yang… - Uncertainty in Artificial …, 2021 - proceedings.mlr.press

Action-constrained reinforcement learning (RL) is a widely-used approach in various real-world
applications, such as scheduling in networked systems with resource constraints and …

Cited by 13 · Related articles · All 10 versions

[PDF] arxiv.org

Constrained Reinforcement Learning Under Model Mismatch

Z Sun, S He, F Miao, S Zou - arXiv preprint arXiv:2405.01327, 2024 - arxiv.org

Existing studies on constrained reinforcement learning (RL) may obtain a well-performing
policy in the training environment. However, when deployed in a real environment, it may …

Cited by 1 · Related articles · All 4 versions

[PDF] arxiv.org

Lyapunov-based safe policy optimization for continuous control

Y Chow, O Nachum, A Faust… - arXiv preprint arXiv …, 2019 - arxiv.org

We study continuous action reinforcement learning problems in which it is crucial that the
agent interacts with the environment only through safe policies, i.e., policies that do not take …

Cited by 258 · Related articles · All 5 versions

[PDF] neurips.cc

Iterative amortized policy optimization

J Marino, A Piché, AD Ialongo… - Advances in Neural …, 2021 - proceedings.neurips.cc

Policy networks are a central feature of deep reinforcement learning (RL) algorithms for
continuous control, enabling the estimation and sampling of high-value actions. From the …

Cited by 22 · Related articles · All 9 versions

[PDF] mlr.press

Safe policy learning for continuous control

Y Chow, O Nachum, A Faust… - … on Robot Learning, 2021 - proceedings.mlr.press

We study continuous action reinforcement learning problems in which it is crucial that the
agent interacts with the environment only through near-safe policies, i.e., policies that keep …

Cited by 8 · Related articles · All 6 versions

CVaR-Constrained Policy Optimization for Safe Reinforcement Learning

Q Zhang, S Leng, X Ma, Q Liu, X Wang… - … on Neural Networks …, 2024 - ieeexplore.ieee.org

Current constrained reinforcement learning (RL) methods guarantee constraint satisfaction
only in expectation, which is inadequate for safety-critical decision problems. Since a …

Cited by 2 · Related articles · All 3 versions
