Multi-agent best arm identification with private communications

A Rio, M Barlier, I Colin… - … Conference on Machine …, 2023 - proceedings.mlr.press
We address multi-agent best arm identification with privacy guarantees. In this setting,
agents collaborate by communicating to find the optimal arm. To avoid leaking sensitive data …

Conservative exploration in reinforcement learning

E Garcelon, M Ghavamzadeh… - International …, 2020 - proceedings.mlr.press
While learning in an unknown Markov Decision Process (MDP), an agent should trade off
exploration to discover new information about the MDP, and exploitation of the current …

SAMBA: a generic framework for secure federated multi-armed bandits

R Ciucanu, P Lafourcade, G Marcadet… - Journal of Artificial …, 2022 - jair.org
The multi-armed bandit is a reinforcement learning model where a learning agent
repeatedly chooses an action (pull a bandit arm) and the environment responds with a …

Homomorphic Encrypted Revenue Management

M Abdolmaleki, R Momot - Available at SSRN 4724820, 2024 - papers.ssrn.com
We develop a novel homomorphic encryption-based approach to privacy preservation in a
dynamic personalized pricing setting. In each period, the firm offers a personalized price to …