Deciding when to quit the gambler's ruin game with unknown probabilities

FS Perotto, I Trabelsi, S Combettes, V Camps… - International Journal of …, 2021 - Elsevier
In the standard definition of the classical gambler's ruin game, a persistent player enters in a
stochastic process with an initial budget b_0, which is, round after round, either increased by …

Controlling Tail Risk in Online Ski-Rental

M Dinitz, S Im, T Lavastida, B Moseley… - Proceedings of the 2024 …, 2024 - SIAM
The classical ski-rental problem admits a textbook 2-competitive deterministic algorithm, and
a simple randomized algorithm that is e/(e-1)-competitive in expectation. The randomized …

The Survival Bandit Problem

C Riou, J Honda, M Sugiyama - arXiv preprint arXiv:2206.03019, 2022 - arxiv.org
We study the survival bandit problem, a variant of the multi-armed bandit problem with a
constraint on the cumulative reward; at each time step, the agent receives a reward in [-1, 1] …

Gambler bandits and the regret of being ruined

FS Perotto, S Vakili, P Gajane, Y Faghan… - … on Autonomous Agents …, 2021 - hal.science
In this paper we consider a particular class of problems called multiarmed gambler bandits
(MAGB) which constitutes a modified version of the Bernoulli MAB problem where two new …

Simple Modification of the Upper Confidence Bound Algorithm by Generalized Weighted Averages

N Manome, S Shinohara, U Chung - arXiv preprint arXiv:2308.14350, 2023 - arxiv.org
The multi-armed bandit (MAB) problem is a classical problem that models sequential
decision-making under uncertainty in reinforcement learning. In this study, we propose a …

Approaching Single-Episode Survival Reinforcement Learning with Safety-Threshold Q-Learning

FS Perotto, M Nargeot, A Ouahbi - International Conference on …, 2024 - Springer
Survival Reinforcement Learning is a specific type of RL problem constrained by a
risk of ruin. The underlying stochastic sequential decision process with which the agent …

Optimizing recommendations under abandonment risks: Models and algorithms

X Wang, H Xie, P Wang, JCS Lui - Performance Evaluation, 2023 - Elsevier
User abandonment behaviors are quite common in recommendation applications such as
online shopping recommendation and news recommendation. To maximize its total “reward” …

Survival Multiarmed Bandits with Bootstrapping Methods

P Veroutis, F Godin - arXiv preprint arXiv:2410.16486, 2024 - arxiv.org
The Multiarmed Bandits (MAB) problem has been extensively studied and has seen many
practical applications in a variety of fields. The Survival Multiarmed Bandits (S-MAB) open …

Time is Budget: A Heuristic for Reducing the Risk of Ruin in Multi-armed Gambler Bandits

FS Perotto, X Pucel, JL Farges - International Conference on Innovative …, 2022 - Springer
In this paper we consider Multi-Armed Gambler Bandits (MAGB), a stochastic random
process in which an agent performs successive actions and either loses 1 unit from its …

HAL Id: hal-03120813 https://hal.archives-ouvertes.fr/hal-03120813

FS Perotto, S Vakili, P Gajane, Y Faghan, M Bourgais - academia.edu
In this paper we consider a particular class of problems called multiarmed gambler bandits
(MAGB) which constitutes a modified version of the Bernoulli MAB problem where two new …