P Thomas, E Brunskill - … Conference on Machine Learning, 2016 - proceedings.mlr.press
In this paper we present a new way of predicting the performance of a reinforcement learning policy given historical data that may have been generated by a different policy. The ability …
… Further interrogation of the state-action value function will be made later when I cover multiple approaches to evaluating reinforcement-learning algorithms with batch data. …
N Jiang, L Li - International conference on machine learning, 2016 - proceedings.mlr.press
We study the problem of off-policy value evaluation in reinforcement learning (RL), where one aims to estimate the value of a new policy based on data collected by a different policy. …
V Jayawardana, C Tang, S Li… - Advances in Neural …, 2022 - proceedings.neurips.cc
… DRL methods on select MDP instances, evaluating the … evaluating on an MDP family is nontrivial. Overall, this work identifies new challenges for empirical rigor in reinforcement learning…
… this section we analyze some of the evaluation metrics commonly used in the reinforcement … We focus on evaluation methods for the policy optimization view (with offline evaluation), but …
… Reinforcement learning (RL) is one of the most vibrant research frontiers in machine learning … In this paper, we primarily focus on off-policy evaluation (OPE), one of the most …
… in evaluation in RL compared to supervised learning, … evaluation in RL, and propose an evaluation pipeline that can be decoupled from the algorithm code. We hope such an evaluation …
… empirical evaluation, to act as a foundation for future comparative studies. Two classes of multiobjective reinforcement learning algorithms are identified, and appropriate evaluation …
… learning and deep reinforcement learning efforts [23, 35]. As OPE is central to real-world applications of reinforcement learning… work on OPE evaluation for reinforcement learning [17, 18…