C Voloshin, HM Le, AI Argo, N Jiang, Y Yue - datasets-benchmarks-proceedings …
We offer an experimental benchmark and empirical study for off-policy policy evaluation
(OPE) in reinforcement learning, which is a key problem in many safety-critical applications …