Deeply-debiased off-policy interval estimation

C Shi, R Wan, V Chernozhukov… - … conference on machine …, 2021 - proceedings.mlr.press
Off-policy evaluation learns a target policy's value with a historical dataset generated by a
different behavior policy. In addition to a point estimate, many applications would benefit …

Deeply-Debiased Off-Policy Interval Estimation

C Shi, R Wan, V Chernozhukov, R Song - arXiv e-prints, 2021 - ui.adsabs.harvard.edu

Deeply-Debiased Off-Policy Interval Estimation

C Shi, R Wan, V Chernozhukov, R Song - proceedings.mlr.press

Deeply-Debiased Off-Policy Interval Estimation

C Shi, R Wan, V Chernozhukov, R Song - arXiv preprint arXiv:2105.04646, 2021 - arxiv.org

Deeply-debiased off-policy interval estimation

C Shi, R Wan, V Chernozhukov, R Song - 2021 - eprints.lse.ac.uk