COptiDICE: Offline constrained reinforcement learning via stationary distribution correction...

Y Cao, H Zhao, Y Cheng, T Shu, Y Chen… - … on Neural Networks …, 2024 - ieeexplore.ieee.org

With extensive pretrained knowledge and high-level general capabilities, large language
models (LLMs) emerge as a promising avenue to augment reinforcement learning (RL) in …

Cited by 33 · Related articles · All 2 versions

OmniSafe: An infrastructure for accelerating safe reinforcement learning research

J Ji, J Zhou, B Zhang, J Dai, X Pan, R Sun… - Journal of Machine …, 2024 - jmlr.org

AI systems empowered by reinforcement learning (RL) algorithms harbor the immense
potential to catalyze societal advancement, yet their deployment is often impeded by …

Cited by 39 · Related articles · All 4 versions

Constrained decision transformer for offline safe reinforcement learning

Z Liu, Z Guo, Y Yao, Z Cen, W Yu… - International …, 2023 - proceedings.mlr.press

Safe reinforcement learning (RL) trains a constraint satisfaction policy by interacting with the
environment. We aim to tackle a more challenging problem: learning a safe policy from an …

Cited by 62 · Related articles · All 7 versions

VOCE: Variational optimization with conservative estimation for offline safe reinforcement learning

J Guan, G Chen, J Ji, L Yang… - Advances in Neural …, 2024 - proceedings.neurips.cc

Offline safe reinforcement learning (RL) algorithms promise to learn policies that satisfy
safety constraints directly in offline datasets without interacting with the environment. This …

Cited by 11 · Related articles · All 4 versions

A Survey on Recent Advancements in Autonomous Driving Using Deep Reinforcement Learning: Applications, Challenges, and Solutions

R Zhao, Y Li, Y Fan, F Gao… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Autonomous driving (AD) endows vehicles with the capability to drive partly or entirely
without human intervention. AD agents generate driving policies based on online perception …

Cited by 2 · Related articles · All 2 versions

Survival instinct in offline reinforcement learning

A Li, D Misra, A Kolobov… - Advances in neural …, 2024 - proceedings.neurips.cc

We present a novel observation about the behavior of offline reinforcement learning (RL)
algorithms: on many benchmark datasets, offline RL can produce well-performing and safe …

Cited by 19 · Related articles · All 5 versions

Datasets and benchmarks for offline safe reinforcement learning

Z Liu, Z Guo, H Lin, Y Yao, J Zhu, Z Cen, H Hu… - arXiv preprint arXiv …, 2023 - arxiv.org

This paper presents a comprehensive benchmarking suite tailored to offline safe
reinforcement learning (RL) challenges, aiming to foster progress in the development and …

Cited by 37 · Related articles · All 2 versions

Safe offline reinforcement learning with feasibility-guided diffusion model

Y Zheng, J Li, D Yu, Y Yang, SE Li, X Zhan… - arXiv preprint arXiv …, 2024 - arxiv.org

Safe offline RL is a promising way to bypass risky online interactions towards safe policy
learning. Most existing methods only enforce soft constraints, i.e., constraining safety …

Cited by 24 · Related articles · All 4 versions

How Far I'll Go: Offline Goal-Conditioned Reinforcement Learning via f-Advantage Regression

YJ Ma, J Yan, D Jayaraman, O Bastani - arXiv preprint arXiv:2206.03023, 2022 - arxiv.org

Offline goal-conditioned reinforcement learning (GCRL) promises general-purpose skill
learning in the form of reaching diverse goals from purely offline datasets. We propose …

Cited by 30 · Related articles · All 4 versions

Tempo adaptation in non-stationary reinforcement learning

H Lee, Y Ding, J Lee, M Jin… - Advances in Neural …, 2024 - proceedings.neurips.cc

We first raise and tackle a "time synchronization" issue between the agent and the
environment in non-stationary reinforcement learning (RL), a crucial factor hindering its real …

Cited by 3 · Related articles · All 8 versions
