Near-optimal differentially private reinforcement learning

F Zhao, D Qiao, R Redberg… - Advances in …, 2022 - proceedings.neurips.cc

Linear sketches have been widely adopted to process fast data streams, and they can be
used to accurately answer frequency estimation, approximate top K items, and summarize …

Cited by 22 · Related articles · All 6 versions

[PDF] neurips.cc

Offline reinforcement learning with differential privacy

D Qiao, YX Wang - Advances in Neural Information …, 2024 - proceedings.neurips.cc

The offline reinforcement learning (RL) problem is often motivated by the need to learn data-driven
decision policies in financial, legal and healthcare applications. However, the learned …

Cited by 14 · Related articles · All 7 versions

[PDF] mlr.press

Doubly fair dynamic pricing

J Xu, D Qiao, YX Wang - International Conference on …, 2023 - proceedings.mlr.press

We study the problem of online dynamic pricing with two types of fairness constraints: a
“procedural fairness” which requires the “proposed” prices to be equal in expectation among …

Cited by 7 · Related articles · All 5 versions

[PDF] arxiv.org

Privacy Preserving Reinforcement Learning for Population Processes

S Yang-Zhao, KS Ng - arXiv preprint arXiv:2406.17649, 2024 - arxiv.org

We consider the problem of privacy protection in Reinforcement Learning (RL) algorithms
that operate over population processes, a practical but understudied setting that includes, for …

Cited by 1 · Related articles · All 2 versions

[PDF] mlr.press

Differentially private episodic reinforcement learning with heavy-tailed rewards

Y Wu, X Zhou, SR Chowdhury… - … Conference on Machine …, 2023 - proceedings.mlr.press

In this paper we study the problem of (finite horizon tabular) Markov decision processes
(MDPs) with heavy-tailed rewards under the constraint of differential privacy (DP) …

Differentially Private No-regret Exploration in Adversarial Markov Decision Processes

S Bai, L Zeng, C Zhao, X Duan, MS Talebi… - The 40th Conference on … - openreview.net

We study learning adversarial Markov decision process (MDP) in the episodic setting under
the constraint of differential privacy (DP). This is motivated by the widespread applications of …

[PDF] openreview.net

Follow-the-Perturbed-Leader for Adversarial Bandits: Heavy Tails, Robustness, and Privacy

D Cheng, X Zhou, B Ji - openreview.net

We study adversarial bandit problems with potentially heavy-tailed losses. Unlike standard
settings with non-negative and bounded losses, managing negative and unbounded losses …
