A survey on offline reinforcement learning: Taxonomy, review, and open problems

S Guo, L Zou, H Chen, B Qu, H Chi… - … on Knowledge and …, 2023 - ieeexplore.ieee.org

Offline reinforcement learning (RL) makes it possible to train the agents entirely from a
previously collected dataset. However, constrained by the quality of the offline dataset …

Cited by 2 · Related articles · All 4 versions

[PDF] mlr.press

Bayesian reparameterization of reward-conditioned reinforcement learning with energy-based models

W Ding, T Che, D Zhao… - … Conference on Machine …, 2023 - proceedings.mlr.press

Recently, reward-conditioned reinforcement learning (RCRL) has gained popularity due to
its simplicity, flexibility, and off-policy nature. However, we will show that current RCRL …

Cited by 2 · Related articles · All 6 versions

[PDF] arxiv.org

ORL-AUDITOR: Dataset Auditing in Offline Deep Reinforcement Learning

L Du, M Chen, M Sun, S Ji, P Cheng, J Chen… - arXiv preprint arXiv …, 2023 - arxiv.org

Data is a critical asset in AI, as high-quality datasets can significantly improve the
performance of machine learning models. In safety-critical domains such as autonomous …

Cited by 2 · Related articles · All 4 versions

[PDF] arxiv.org

An invitation to deep reinforcement learning

B Jaeger, A Geiger - arXiv preprint arXiv:2312.08365, 2023 - arxiv.org

Training a deep neural network to maximize a target objective has become the standard
recipe for successful machine learning over the last decade. These networks can be …

Cited by 2 · Related articles · All 5 versions

[HTML] springer.com

[HTML] Offline reinforcement learning in high-dimensional stochastic environments

F Hêche, O Barakat, T Desmettre, T Marx… - Neural Computing and …, 2024 - Springer

Offline reinforcement learning (RL) has emerged as a promising paradigm for real-world
applications since it aims to train policies directly from datasets of past interactions with the …

Cited by 2 · Related articles · All 5 versions

[PDF] arxiv.org

Robust offline policy evaluation and optimization with heavy-tailed rewards

J Zhu, R Wan, Z Qi, S Luo, C Shi - arXiv preprint arXiv:2310.18715, 2023 - arxiv.org

This paper endeavors to augment the robustness of offline reinforcement learning (RL) in
scenarios laden with heavy-tailed rewards, a prevalent circumstance in real-world …

Cited by 2 · Related articles · All 3 versions

Reinforcement learning for blast furnace ironmaking operation with safety and partial observation considerations

K Jiang, Z Jiang, X Jiang, Y Xie… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Making proper decision online in complex environment during the blast furnace (BF)
operation is a key factor in achieving long-term success and profitability in the steel …

Cited by 1 · Related articles · All 4 versions

[PDF] arxiv.org

IQL-TD-MPC: Implicit Q-learning for hierarchical model predictive control

R Chitnis, Y Xu, B Hashemi, L Lehnert, U Dogan… - arXiv preprint arXiv …, 2023 - arxiv.org

Model-based reinforcement learning (RL) has shown great promise due to its sample
efficiency, but still struggles with long-horizon sparse-reward tasks, especially in offline …

Cited by 5 · Related articles · All 3 versions

[PDF] arxiv.org

Learning from sparse offline datasets via conservative density estimation

Z Cen, Z Liu, Z Wang, Y Yao, H Lam, D Zhao - arXiv preprint arXiv …, 2024 - arxiv.org

Offline reinforcement learning (RL) offers a promising direction for learning policies from
pre-collected datasets without requiring further interactions with the environment. However …

Cited by 2 · Related articles · All 6 versions

[PDF] wiley.com

Deep reinforcement learning for personalized treatment recommendation

M Liu, X Shen, W Pan - Statistics in Medicine, 2022 - Wiley Online Library

In precision medicine, the ultimate goal is to recommend the most effective treatment to an
individual patient based on patient‐specific molecular and clinical profiles, possibly high …

Cited by 23 · Related articles · All 14 versions
