Deciding when to quit the gambler's ruin game with unknown probabilities

FS Perotto, I Trabelsi, S Combettes, V Camps… - International Journal of …, 2021 - Elsevier
In the standard definition of the classical gambler's ruin game, a persistent player enters in a
stochastic process with an initial budget b_0, which is, round after round, either increased by …

Modeling Risk in Reinforcement Learning: A Literature Mapping

L Villalobos-Arias, D Martin, A Krishnan… - arXiv preprint arXiv …, 2023 - arxiv.org
Safe reinforcement learning deals with mitigating or avoiding unsafe situations by
reinforcement learning (RL) agents. Safe RL approaches are based on specific risk …

Approaching Single-Episode Survival Reinforcement Learning with Safety-Threshold Q-Learning

FS Perotto, M Nargeot, A Ouahbi - International Conference on …, 2024 - hal.science
Survival Reinforcement Learning is a specific type of RL problem constrained by a risk of
ruin. The underlying stochastic sequential decision process with which the agent interacts …

Survival Multiarmed Bandits with Bootstrapping Methods

P Veroutis, F Godin - arXiv preprint arXiv:2410.16486, 2024 - arxiv.org
The Multiarmed Bandits (MAB) problem has been extensively studied and has seen many
practical applications in a variety of fields. The Survival Multiarmed Bandits (S-MAB) open …

Time is Budget: A Heuristic for Reducing the Risk of Ruin in Multi-armed Gambler Bandits

FS Perotto, X Pucel, JL Farges - International Conference on Innovative …, 2022 - Springer
In this paper we consider Multi-Armed Gambler Bandits (MAGB), a stochastic random
process in which an agent performs successive actions and either loses 1 unit from its …

[CITATION][C] Manos Theodosis Harvard University etheodosis@seas.harvard.edu

M Guo