Imitating human demonstrations is a promising approach to endow robots with various manipulation capabilities. While recent advances have been made in imitation learning and …
T Yamagata, A Khalil… - … on Machine Learning, 2023 - proceedings.mlr.press
Recent works have shown that tackling offline reinforcement learning (RL) with a conditional policy produces promising results. The Decision Transformer (DT) combines the conditional …
M Yin, YX Wang - Advances in neural information …, 2021 - proceedings.neurips.cc
We study the offline reinforcement learning (offline RL) problem, where the goal is to learn a reward-maximizing policy in an unknown Markov Decision Process (MDP) …
S Tang, J Wiens - Machine Learning for Healthcare …, 2021 - proceedings.mlr.press
Reinforcement learning (RL) can be used to learn treatment policies and aid decision making in healthcare. However, given the need for generalization over complex state/action …
In reinforcement learning (RL), representation learning is a proven tool for complex image-based tasks, but is often overlooked for environments with low-level states, such as physical …
H Xu, X Zhan, H Yin, H Qin - International Conference on …, 2022 - proceedings.mlr.press
We study the problem of offline Imitation Learning (IL) where an agent aims to learn an optimal expert behavior policy without additional online environment interactions. Instead …
J Wu, H Wu, Z Qiu, J Wang… - Advances in Neural …, 2022 - proceedings.neurips.cc
Policy constraint methods to offline reinforcement learning (RL) typically utilize parameterization or regularization that constrains the policy to perform actions within the …
Y Li - arXiv preprint arXiv:2202.11296, 2022 - arxiv.org
This article is a gentle discussion about the field of reinforcement learning in practice, about opportunities and challenges, touching a broad range of topics, with perspectives and …
RJ Qin, X Zhang, S Gao, XH Chen… - Advances in …, 2022 - proceedings.neurips.cc
Offline reinforcement learning (RL) aims at learning effective policies from historical data without extra environment interactions. During our experience of applying offline RL, we …