Off-policy evaluation for large action spaces via policy convolution

N Sachdeva, L Wang, D Liang, N Kallus… - Proceedings of the ACM …, 2024 - dl.acm.org
Developing accurate off-policy estimators is crucial for both evaluating and optimizing for
new policies. The main challenge in off-policy estimation is the distribution shift between the …

Variational weighting for kernel density ratios

S Yoon, F Park, G Yun, I Kim… - Advances in Neural …, 2024 - proceedings.neurips.cc
Kernel density estimation (KDE) is integral to a range of generative and discriminative tasks
in machine learning. Drawing upon tools from the multidimensional calculus of variations …

Off-policy estimation with adaptively collected data: the power of online learning

J Lee, C Ma - arXiv preprint arXiv:2411.12786, 2024 - arxiv.org
We consider estimation of a linear functional of the treatment effect using adaptively
collected data. This task finds a variety of applications including the off-policy evaluation …

Kernel Metric Learning for In-Sample Off-Policy Evaluation of Deterministic RL Policies

H Lee, TW Guntara, J Lee, YK Noh, KE Kim - arXiv preprint arXiv …, 2024 - arxiv.org
We consider off-policy evaluation (OPE) of deterministic target policies for reinforcement
learning (RL) in environments with continuous action spaces. While it is common to use …