UAC: Offline reinforcement learning with uncertain action constraint

J Guan, G Chen, J Ji, L Yang… - Advances in Neural …, 2024 - proceedings.neurips.cc

Offline safe reinforcement learning (RL) algorithms promise to learn policies that satisfy
safety constraints directly in offline datasets without interacting with the environment. This …

Cited by 7 Related articles All 4 versions

[PDF] thecvf.com

POCE: Primal Policy Optimization with Conservative Estimation for Multi-constraint Offline Reinforcement Learning

J Guan, L Shen, A Zhou, L Li, H Hu… - Proceedings of the …, 2024 - openaccess.thecvf.com

Multi-constraint offline reinforcement learning (RL) promises to learn policies that satisfy
both cumulative and state-wise costs from offline datasets. This arrangement provides an …

Cited by 2 Related articles All 2 versions

[PDF] arxiv.org

Deep generative models for offline policy learning: Tutorial, survey, and perspectives on future directions

J Chen, B Ganguly, Y Xu, Y Mei, T Lan… - arXiv preprint arXiv …, 2024 - arxiv.org

Deep generative models (DGMs) have demonstrated great success across various domains,
particularly in generating texts, images, and videos using models trained from offline data …

Cited by 4 Related articles All 3 versions

Hybrid residual multiexpert reinforcement learning for spatial scheduling of high-density parking lots

J Hou, G Chen, Z Li, W He, S Gu… - IEEE transactions on …, 2023 - ieeexplore.ieee.org

Industries, such as manufacturing, are accelerating their embrace of the metaverse to
achieve higher productivity, especially in complex industrial scheduling. In view of the …

Cited by 5 Related articles All 5 versions

Offline Reinforcement Learning With Behavior Value Regularization

L Huang, B Dong, W Xie… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Offline reinforcement learning (offline RL) aims to find task-solving policies from prerecorded
datasets without online environment interaction. It is unfortunate that extrapolation errors can …

Cited by 2 Related articles All 3 versions

[PDF] arxiv.org

Preferred-Action-Optimized Diffusion Policies for Offline Reinforcement Learning

T Zhang, J Guan, L Zhao, Y Li, D Li, Z Zeng… - arXiv preprint arXiv …, 2024 - arxiv.org

Offline reinforcement learning (RL) aims to learn optimal policies from previously collected
datasets. Recently, due to their powerful representational capabilities, diffusion models have …
