Empirical study of off-policy policy evaluation for reinforcement learning

C Voloshin, HM Le, N Jiang, Y Yue - arXiv preprint arXiv:1911.06854, 2019 - arxiv.org
We offer an experimental benchmark and empirical study for off-policy policy evaluation
(OPE) in reinforcement learning, which is a key problem in many safety critical applications …

[PDF] Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning

C Voloshin, HM Le, AI Argo, N Jiang, Y Yue - datasets-benchmarks-proceedings …
We offer an experimental benchmark and empirical study for off-policy policy evaluation
(OPE) in reinforcement learning, which is a key problem in many safety critical applications …

Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning

C Voloshin, HM Le, N Jiang, Y Yue - Thirty-fifth Conference on Neural … - openreview.net
We offer an experimental benchmark and empirical study for off-policy policy evaluation
(OPE) in reinforcement learning, which is a key problem in many safety critical applications …

[PDF] Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning

C Voloshin, HM Le, N Jiang, Y Yue - core.ac.uk
Off-policy policy evaluation (OPE) is the problem of estimating the online performance of a
policy using only pre-collected historical data generated by another policy. Given the …

[PDF] Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning

C Voloshin, HM Le, AI Argo, N Jiang, Y Yue - cvoloshin.com
We offer an experimental benchmark and empirical study for off-policy policy evaluation
(OPE) in reinforcement learning, which is a key problem in many safety critical applications …

[PDF] Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning

C Voloshin, HM Le, N Jiang, Y Yue - authors.library.caltech.edu
Off-policy policy evaluation (OPE) is the problem of estimating the online performance of a
policy using only pre-collected historical data generated by another policy. Given the …

[PDF] Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning

C Voloshin, HM Le, AI Argo, N Jiang, Y Yue - cameronvoloshin.com
We offer an experimental benchmark and empirical study for off-policy policy evaluation
(OPE) in reinforcement learning, which is a key problem in many safety critical applications …

[PDF] Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning

C Voloshin, HM Le, AI Argo, N Jiang, Y Yue - clvoloshin.com
We offer an experimental benchmark and empirical study for off-policy policy evaluation
(OPE) in reinforcement learning, which is a key problem in many safety critical applications …

Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning

C Voloshin, HM Le, N Jiang, Y Yue - arXiv e-prints, 2019 - ui.adsabs.harvard.edu
We offer an experimental benchmark and empirical study for off-policy policy evaluation
(OPE) in reinforcement learning, which is a key problem in many safety critical applications …
