Empirical study of off-policy policy evaluation for reinforcement learning

RF Prudencio, MROA Maximo… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

With the widespread adoption of deep learning, reinforcement learning (RL) has
experienced a dramatic increase in popularity, scaling to previously intractable problems …

Cited by 219 · Related articles · All 9 versions

Offline RL without off-policy evaluation

D Brandfonbrener, W Whitney… - Advances in neural …, 2021 - proceedings.neurips.cc

Most prior approaches to offline reinforcement learning (RL) have taken an iterative actor-
critic approach involving off-policy evaluation. In this paper we show that simply doing one …

Cited by 150 · Related articles · All 10 versions

Hyperparameter selection for offline reinforcement learning

TL Paine, C Paduraru, A Michi, C Gulcehre… - arXiv preprint arXiv …, 2020 - arxiv.org

Offline reinforcement learning (RL purely from logged data) is an important avenue for
deploying RL techniques in real-world scenarios. However, existing hyperparameter …

Cited by 149 · Related articles · All 2 versions

Ten questions concerning reinforcement learning for building energy management

Z Nagy, G Henze, S Dey, J Arroyo, L Helsen… - Building and …, 2023 - Elsevier

As buildings account for approximately 40% of global energy consumption and associated
greenhouse gas emissions, their role in decarbonizing the power grid is crucial. The …

Cited by 34 · Related articles · All 6 versions

RL Unplugged: A suite of benchmarks for offline reinforcement learning

C Gulcehre, Z Wang, A Novikov… - Advances in …, 2020 - proceedings.neurips.cc

Offline methods for reinforcement learning have a potential to help bridge the gap between
reinforcement learning research and real-world applications. They make it possible to learn …

Cited by 165 · Related articles · All 8 versions

Model selection for offline reinforcement learning: Practical considerations for healthcare settings

S Tang, J Wiens - Machine Learning for Healthcare …, 2021 - proceedings.mlr.press

Reinforcement learning (RL) can be used to learn treatment policies and aid decision
making in healthcare. However, given the need for generalization over complex state/action …

Cited by 72 · Related articles · All 9 versions

A workflow for offline model-free robotic reinforcement learning

A Kumar, A Singh, S Tian, C Finn, S Levine - arXiv preprint arXiv …, 2021 - arxiv.org

Offline reinforcement learning (RL) enables learning control policies by utilizing only prior
experience, without any online interaction. This can allow robots to acquire generalizable …

Cited by 84 · Related articles · All 6 versions

Reinforcement learning in practice: Opportunities and challenges

Y Li - arXiv preprint arXiv:2202.11296, 2022 - arxiv.org

This article is a gentle discussion about the field of reinforcement learning in practice, about
opportunities and challenges, touching a broad range of topics, with perspectives and …

Cited by 15 · Related articles · All 2 versions

Off-policy evaluation for large action spaces via conjunct effect modeling

Y Saito, Q Ren, T Joachims - International Conference on …, 2023 - proceedings.mlr.press

We study off-policy evaluation (OPE) of contextual bandit policies for large discrete action
spaces where conventional importance-weighting approaches suffer from excessive …

Cited by 15 · Related articles · All 8 versions

Multi-task fusion via reinforcement learning for long-term user satisfaction in recommender systems

Q Zhang, J Liu, Y Dai, Y Qi, Y Yuan, K Zheng… - Proceedings of the 28th …, 2022 - dl.acm.org

Recommender System (RS) is an important online application that affects billions of users
every day. The mainstream RS ranking framework is composed of two parts: a Multi-Task …

Cited by 39 · Related articles · All 3 versions
