All versions - Academic Resource Search

CoinDICE: Off-policy confidence interval estimation

B Dai, O Nachum, Y Chow, L Li… - Advances in neural …, 2020 - proceedings.neurips.cc

We study high-confidence behavior-agnostic off-policy evaluation in reinforcement learning,
where the goal is to estimate a confidence interval on a target policy's value, given only …

Cited by 85

[PDF] CoinDICE: Off-Policy Confidence Interval Estimation

B Dai, O Nachum, Y Chow, L Li, C Szepesvári… - ualberta.ca

We study high-confidence behavior-agnostic off-policy evaluation in reinforcement learning,
where the goal is to estimate a confidence interval on a target policy's value, given only …

[PDF] CoinDICE: Off-Policy Confidence Interval Estimation

B Dai, O Nachum, Y Chow, L Li, C Szepesvári… - sites.ualberta.ca

We study high-confidence behavior-agnostic off-policy evaluation in reinforcement learning,
where the goal is to estimate a confidence interval on a target policy's value, given only …

CoinDICE: off-policy confidence interval estimation

B Dai, O Nachum, Y Chow, L Li, C Szepesvári… - Proceedings of the 34th …, 2020 - dl.acm.org

We study high-confidence behavior-agnostic off-policy evaluation in reinforcement learning,
where the goal is to estimate a confidence interval on a target policy's value, given only …

CoinDICE: Off-Policy Confidence Interval Estimation

B Dai, O Nachum, Y Chow, L Li, C Szepesvári… - arXiv e …, 2020 - ui.adsabs.harvard.edu

We study high-confidence behavior-agnostic off-policy evaluation in reinforcement learning,
where the goal is to estimate a confidence interval on a target policy's value, given only …

[PDF] CoinDICE: Off-Policy Confidence Interval Estimation

B Dai, O Nachum, Y Chow, L Li, C Szepesvári… - webdocs.cs.ualberta.ca

We study high-confidence behavior-agnostic off-policy evaluation in reinforcement learning,
where the goal is to estimate a confidence interval on a target policy's value, given only …

[PDF] CoinDICE: Off-policy Confidence Interval Estimation

B Dai - simons.berkeley.edu

CoinDICE: Off-policy Confidence Interval Estimation. Bo Dai, Google Research, Brain Team;
joint work with Ofir Nachum, Yinlam …
