Authors
Cameron Voloshin, Hoang M Le, Yisong Yue
Publication date
2019
Journal
Real-world Sequential Decision Making Workshop at ICML
Volume
2019
Description
Off-policy policy evaluation (OPE) is the task of predicting the online performance of a policy using only pre-collected historical data (collected from an existing deployed policy or set of policies). For many real-world applications, accurate OPE is crucial since deploying bad policies can be prohibitively costly or dangerous. With the increasing interest in deploying learning-based methods for safety-critical applications, the study of OPE has also become correspondingly more important. In this paper, we present the first comprehensive empirical analysis of most of the recently proposed OPE methods. Based on thousands of experiments and detailed empirical analyses, we offer a summarized set of guidelines for effectively using OPE in practice, as well as suggest directions for future research to address current limitations.
Total citations
2020: 1, 2021: 1, 2022: 1, 2023: 2
Scholar articles
C Voloshin, HM Le, Y Yue - Real-world Sequential Decision Making Workshop at …, 2019