Because logged data has become ubiquitous across a wide range of applications, and because online exploration may be sensitive, counterfactual methods have gained significant …
Offline policy learning methods are intended to learn a policy from logged data, which includes context, action, and reward for each sample point. In this work we build on the …
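As a rough illustration of the logged-data setting described above, the sketch below stores one (context, action, reward) record per sample and scores a candidate policy with inverse propensity scoring (IPS). IPS is swapped in here as a common counterfactual estimator and is not claimed to be the method of any work listed here. The `propensity` field, the `LoggedSample` and `ips_value` names, and the toy data are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class LoggedSample:
    """One logged interaction: context, chosen action, observed reward."""
    context: Sequence[float]
    action: int
    reward: float
    # Assumption: the log also records the probability with which the
    # logging policy chose `action`; the source text does not mention this.
    propensity: float


def ips_value(samples: Sequence[LoggedSample],
              target_policy: Callable[[Sequence[float]], int]) -> float:
    """Inverse propensity scoring estimate of a deterministic policy's value."""
    if not samples:
        raise ValueError("need at least one logged sample")
    total = 0.0
    for s in samples:
        # Only samples where the target policy agrees with the logged action
        # contribute, reweighted by the inverse logging probability.
        if target_policy(s.context) == s.action:
            total += s.reward / s.propensity
    return total / len(samples)


# Toy log (hypothetical): a uniform logging policy over two actions.
logs = [
    LoggedSample((1.0,), 0, 1.0, 0.5),
    LoggedSample((1.0,), 1, 0.0, 0.5),
    LoggedSample((-1.0,), 0, 0.0, 0.5),
    LoggedSample((-1.0,), 1, 1.0, 0.5),
]


def threshold_policy(context: Sequence[float]) -> int:
    """Hypothetical target policy: action 0 for positive contexts, else 1."""
    return 0 if context[0] > 0 else 1


estimate = ips_value(logs, threshold_policy)
print(estimate)
```

The estimate is unbiased only if the logging propensities are correct and nonzero for every action the target policy can take; small propensities inflate the variance.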
Counterfactual Estimation from Logged Data. Raphaël Féraud (Orange Innovation), March 2023 …