Benchmarks for deep off-policy evaluation

RF Prudencio, MROA Maximo… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

With the widespread adoption of deep learning, reinforcement learning (RL) has
experienced a dramatic increase in popularity, scaling to previously intractable problems …

Cited by 255 · Related articles · All 9 versions

[PDF] openreview.net

What matters in learning from offline human demonstrations for robot manipulation

A Mandlekar, D Xu, J Wong, S Nasiriany… - arXiv preprint arXiv …, 2021 - arxiv.org

Imitating human demonstrations is a promising approach to endow robots with various
manipulation capabilities. While recent advances have been made in imitation learning and …

Cited by 325 · Related articles · All 4 versions

[PDF] mlr.press

Q-learning decision transformer: Leveraging dynamic programming for conditional sequence modelling in offline rl

T Yamagata, A Khalil… - … on Machine Learning, 2023 - proceedings.mlr.press

Recent works have shown that tackling offline reinforcement learning (RL) with a conditional
policy produces promising results. The Decision Transformer (DT) combines the conditional …

Cited by 58 · Related articles · All 9 versions

[PDF] neurips.cc

Towards instance-optimal offline reinforcement learning with pessimism

M Yin, YX Wang - Advances in neural information …, 2021 - proceedings.neurips.cc

We study the offline reinforcement learning (offline RL) problem, where the goal is to
learn a reward-maximizing policy in an unknown Markov Decision Process (MDP) …

Cited by 80 · Related articles · All 7 versions

[PDF] mlr.press

Model selection for offline reinforcement learning: Practical considerations for healthcare settings

S Tang, J Wiens - Machine Learning for Healthcare …, 2021 - proceedings.mlr.press

Reinforcement learning (RL) can be used to learn treatment policies and aid decision
making in healthcare. However, given the need for generalization over complex state/action …

Cited by 80 · Related articles · All 9 versions

[PDF] neurips.cc

For sale: State-action representation learning for deep reinforcement learning

S Fujimoto, WD Chang, E Smith… - Advances in …, 2024 - proceedings.neurips.cc

In reinforcement learning (RL), representation learning is a proven tool for complex
image-based tasks, but is often overlooked for environments with low-level states, such as physical …

Cited by 27 · Related articles · All 5 versions

[PDF] mlr.press

Discriminator-weighted offline imitation learning from suboptimal demonstrations

H Xu, X Zhan, H Yin, H Qin - International Conference on …, 2022 - proceedings.mlr.press

We study the problem of offline Imitation Learning (IL) where an agent aims to learn an
optimal expert behavior policy without additional online environment interactions. Instead …

Cited by 63 · Related articles · All 10 versions

[PDF] neurips.cc

Supported policy optimization for offline reinforcement learning

J Wu, H Wu, Z Qiu, J Wang… - Advances in Neural …, 2022 - proceedings.neurips.cc

Policy constraint methods to offline reinforcement learning (RL) typically utilize
parameterization or regularization that constrains the policy to perform actions within the …

Cited by 51 · Related articles · All 9 versions

[PDF] arxiv.org

Reinforcement learning in practice: Opportunities and challenges

Y Li - arXiv preprint arXiv:2202.11296, 2022 - arxiv.org

This article is a gentle discussion about the field of reinforcement learning in practice, about
opportunities and challenges, touching a broad range of topics, with perspectives and …

Cited by 18 · Related articles · All 2 versions

[PDF] neurips.cc

NeoRL: A near real-world benchmark for offline reinforcement learning

RJ Qin, X Zhang, S Gao, XH Chen… - Advances in …, 2022 - proceedings.neurips.cc

Offline reinforcement learning (RL) aims at learning effective policies from historical data
without extra environment interactions. During our experience of applying offline RL, we …

Cited by 74 · Related articles · All 6 versions
