A survey on offline reinforcement learning: Taxonomy, review, and open problems

RF Prudencio, MROA Maximo… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
With the widespread adoption of deep learning, reinforcement learning (RL) has
experienced a dramatic increase in popularity, scaling to previously intractable problems …

What matters in learning from offline human demonstrations for robot manipulation

A Mandlekar, D Xu, J Wong, S Nasiriany… - arXiv preprint arXiv …, 2021 - arxiv.org
Imitating human demonstrations is a promising approach to endow robots with various
manipulation capabilities. While recent advances have been made in imitation learning and …

Q-learning decision transformer: Leveraging dynamic programming for conditional sequence modelling in offline RL

T Yamagata, A Khalil… - … on Machine Learning, 2023 - proceedings.mlr.press
Recent works have shown that tackling offline reinforcement learning (RL) with a conditional
policy produces promising results. The Decision Transformer (DT) combines the conditional …

Towards instance-optimal offline reinforcement learning with pessimism

M Yin, YX Wang - Advances in neural information …, 2021 - proceedings.neurips.cc
We study the offline reinforcement learning (offline RL) problem, where the goal is to
learn a reward-maximizing policy in an unknown Markov Decision Process (MDP) …

Model selection for offline reinforcement learning: Practical considerations for healthcare settings

S Tang, J Wiens - Machine Learning for Healthcare …, 2021 - proceedings.mlr.press
Reinforcement learning (RL) can be used to learn treatment policies and aid decision
making in healthcare. However, given the need for generalization over complex state/action …

For SALE: State-action representation learning for deep reinforcement learning

S Fujimoto, WD Chang, E Smith… - Advances in …, 2024 - proceedings.neurips.cc
In reinforcement learning (RL), representation learning is a proven tool for complex image-
based tasks, but is often overlooked for environments with low-level states, such as physical …

Discriminator-weighted offline imitation learning from suboptimal demonstrations

H Xu, X Zhan, H Yin, H Qin - International Conference on …, 2022 - proceedings.mlr.press
We study the problem of offline Imitation Learning (IL) where an agent aims to learn an
optimal expert behavior policy without additional online environment interactions. Instead …

Supported policy optimization for offline reinforcement learning

J Wu, H Wu, Z Qiu, J Wang… - Advances in Neural …, 2022 - proceedings.neurips.cc
Policy constraint methods to offline reinforcement learning (RL) typically utilize
parameterization or regularization that constrains the policy to perform actions within the …

Reinforcement learning in practice: Opportunities and challenges

Y Li - arXiv preprint arXiv:2202.11296, 2022 - arxiv.org
This article is a gentle discussion about the field of reinforcement learning in practice, about
opportunities and challenges, touching a broad range of topics, with perspectives and …

NeoRL: A near real-world benchmark for offline reinforcement learning

RJ Qin, X Zhang, S Gao, XH Chen… - Advances in …, 2022 - proceedings.neurips.cc
Offline reinforcement learning (RL) aims at learning effective policies from historical data
without extra environment interactions. During our experience of applying offline RL, we …