… Offline RL is a paradigm that learns exclusively from static … Effective offline RL algorithms have a much wider range of … a unifying taxonomy to classify offline RL methods. Furthermore, …
… reinforcement learning (RL) using a fixed offline dataset of logged interactions is an important consideration in real-world applications. This paper studies offline … in the offline setting, we …
… outperforms existing offline RL methods, often learning policies that attain 2-5 times higher final return, especially when learning from complex and multi-modal data distributions. …
S Fujimoto, SS Gu - Advances in neural information …, 2021 - proceedings.neurips.cc
… in offline RL. In this paper, we ask: can we make a deep RL algorithm work offline with minimal changes? We find that we can match the performance of state-of-the-art offline RL …
R Kidambi, A Rajeswaran… - Advances in neural …, 2020 - proceedings.neurips.cc
… offline RL, which allows for data-driven policy learning using pre-collected datasets. The ability to train policies offline can … online learning. Since the dataset has already been collected, …
… for data efficient learning and offline RL, leading to MuZero Unplugged. We demonstrate its effectiveness for the online case through results on Atari and for the offline case through …
M Janner, Q Li, S Levine - Advances in neural information …, 2021 - proceedings.neurips.cc
… -scale language modeling instead of those normally associated with control, we find that this approach is effective in imitation learning, goal-reaching, and offline reinforcement learning. …
… In this section, we discuss how our approach is related to prior work on offline reinforcement learning. In particular, we discuss connections to BCQ (Fujimoto et al., 2019). …
Y Wu, G Tucker, O Nachum - arXiv preprint arXiv:1911.11361, 2019 - arxiv.org
… fails to learn a good policy (e.g., in Walker2d) in the offline setting. Using k = 4 has a small advantage compared to k = 2 except in Hopper. Both k = 2 and k = 4 significantly improve …