A survey on offline reinforcement learning: Taxonomy, review, and open problems

RF Prudencio, MROA Maximo… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
With the widespread adoption of deep learning, reinforcement learning (RL) has
experienced a dramatic increase in popularity, scaling to previously intractable problems …

Offline RL without off-policy evaluation

D Brandfonbrener, W Whitney… - Advances in neural …, 2021 - proceedings.neurips.cc
Most prior approaches to offline reinforcement learning (RL) have taken an iterative
actor-critic approach involving off-policy evaluation. In this paper we show that simply doing one …

Hyperparameter selection for offline reinforcement learning

TL Paine, C Paduraru, A Michi, C Gulcehre… - arXiv preprint arXiv …, 2020 - arxiv.org
Offline reinforcement learning (RL purely from logged data) is an important avenue for
deploying RL techniques in real-world scenarios. However, existing hyperparameter …

Ten questions concerning reinforcement learning for building energy management

Z Nagy, G Henze, S Dey, J Arroyo, L Helsen… - Building and …, 2023 - Elsevier
As buildings account for approximately 40% of global energy consumption and associated
greenhouse gas emissions, their role in decarbonizing the power grid is crucial. The …

RL Unplugged: A suite of benchmarks for offline reinforcement learning

C Gulcehre, Z Wang, A Novikov… - Advances in …, 2020 - proceedings.neurips.cc
Offline methods for reinforcement learning have a potential to help bridge the gap between
reinforcement learning research and real-world applications. They make it possible to learn …

Model selection for offline reinforcement learning: Practical considerations for healthcare settings

S Tang, J Wiens - Machine Learning for Healthcare …, 2021 - proceedings.mlr.press
Reinforcement learning (RL) can be used to learn treatment policies and aid decision
making in healthcare. However, given the need for generalization over complex state/action …

A workflow for offline model-free robotic reinforcement learning

A Kumar, A Singh, S Tian, C Finn, S Levine - arXiv preprint arXiv …, 2021 - arxiv.org
Offline reinforcement learning (RL) enables learning control policies by utilizing only prior
experience, without any online interaction. This can allow robots to acquire generalizable …

Reinforcement learning in practice: Opportunities and challenges

Y Li - arXiv preprint arXiv:2202.11296, 2022 - arxiv.org
This article is a gentle discussion about the field of reinforcement learning in practice, about
opportunities and challenges, touching a broad range of topics, with perspectives and …

Off-policy evaluation for large action spaces via conjunct effect modeling

Y Saito, Q Ren, T Joachims - International Conference on …, 2023 - proceedings.mlr.press
We study off-policy evaluation (OPE) of contextual bandit policies for large discrete action
spaces where conventional importance-weighting approaches suffer from excessive …

Multi-task fusion via reinforcement learning for long-term user satisfaction in recommender systems

Q Zhang, J Liu, Y Dai, Y Qi, Y Yuan, K Zheng… - Proceedings of the 28th …, 2022 - dl.acm.org
Recommender System (RS) is an important online application that affects billions of users
every day. The mainstream RS ranking framework is composed of two parts: a Multi-Task …