Data-efficient policy evaluation through behavior policy search

[PDF] mlr.press

Revar: Strengthening policy evaluation via reduced variance sampling

S Mukherjee, JP Hanna… - Uncertainty in Artificial …, 2022 - proceedings.mlr.press

This paper studies the problem of data collection for policy evaluation in Markov decision
processes (MDPs). In policy evaluation, we are given a target policy and asked to …

Cited by 10 · Related articles · All 9 versions

[PDF] neurips.cc

Robust on-policy sampling for data-efficient policy evaluation in reinforcement learning

R Zhong, D Zhang, L Schäfer… - Advances in Neural …, 2022 - proceedings.neurips.cc

Reinforcement learning (RL) algorithms are often categorized as either on-policy or off-
policy depending on whether they use data from a target policy of interest or from a different …

Cited by 8 · Related articles · All 8 versions
