Private reinforcement learning with PAC and regret guarantees

G Vietri, B Balle, A Krishnamurthy… - … on Machine Learning, 2020 - proceedings.mlr.press
Motivated by high-stakes decision-making domains like personalized medicine where user
information is inherently sensitive, we design privacy-preserving exploration policies for …

Optimal Learning Policies for Differential Privacy in Multi-armed Bandits

S Wang, J Zhu - Journal of Machine Learning Research, 2024 - jmlr.org
This paper studies the multi-armed bandit problem with a requirement of differential privacy
guarantee or global differential privacy guarantee. We first prove that the lower bound for …