Quantile Off-Policy Evaluation via Deep Conditional Generative Learning

Y Xu, C Shi, S Luo, L Wang, R Song - arXiv preprint arXiv:2212.14466, 2022 - arxiv.org
Off-policy evaluation (OPE) is concerned with evaluating a new target policy using offline
data generated by a potentially different behavior policy. It is critical in a number of …

Quantile Off-Policy Evaluation via Deep Conditional Generative Learning

Y Xu, C Shi, S Luo, L Wang, R Song - arXiv e-prints, 2022 - ui.adsabs.harvard.edu