Academic Resource Search

4 results (0.03 seconds)

Multi-agent best arm identification with private communications

A Rio, M Barlier, I Colin… - … Conference on Machine …, 2023 - proceedings.mlr.press

We address multi-agent best arm identification with privacy guarantees. In this setting,
agents collaborate by communicating to find the optimal arm. To avoid leaking sensitive data …

Cited by 5 · Related articles · All 6 versions

[PDF] mlr.press

Conservative exploration in reinforcement learning

E Garcelon, M Ghavamzadeh… - International …, 2020 - proceedings.mlr.press

While learning in an unknown Markov Decision Process (MDP), an agent should trade off
exploration to discover new information about the MDP, and exploitation of the current …

Cited by 31 · Related articles · All 11 versions

[PDF] jair.org Full View

SAMBA: a generic framework for secure federated multi-armed bandits

R Ciucanu, P Lafourcade, G Marcadet… - Journal of Artificial …, 2022 - jair.org

The multi-armed bandit is a reinforcement learning model where a learning agent
repeatedly chooses an action (pull a bandit arm) and the environment responds with a …

Cited by 9 · Related articles · All 14 versions

[PDF] ssrn.com

Homomorphic Encrypted Revenue Management

M Abdolmaleki, R Momot - Available at SSRN 4724820, 2024 - papers.ssrn.com

We develop a novel homomorphic encryption-based approach to privacy preservation in a
dynamic personalized pricing setting. In each period, the firm offers a personalized price to …
