Risk-constrained Thompson sampling for CVaR bandits

JQL Chang, Q Zhu, VYF Tan - arXiv preprint arXiv:2011.08046, 2020 - arxiv.org
The multi-armed bandit (MAB) problem is a ubiquitous decision-making problem that
exemplifies the exploration-exploitation tradeoff. Standard formulations exclude risk in …

Almost optimal variance-constrained best arm identification

Y Hou, VYF Tan, Z Zhong - IEEE Transactions on Information …, 2022 - ieeexplore.ieee.org
We design and analyze Variance-Aware-Lower and Upper Confidence Bound (VA-LUCB), a
parameter-free algorithm, for identifying the best arm under the fixed-confidence setup and …

Enabling An Informed Contextual Multi-Armed Bandit Framework For Stock Trading With Neuroevolution

D Kar, Z Lyu, AG Ororbia, T Desell, D Krutz - Proceedings of the Genetic …, 2024 - dl.acm.org
Multi-armed bandits and contextual multi-armed bandits have demonstrated their proficiency
in a variety of application areas. However, these models are highly susceptible to volatility …

Risk averse non-stationary multi-armed bandits

L Benac, F Godin - arXiv preprint arXiv:2109.13977, 2021 - arxiv.org
This paper tackles the risk-averse multi-armed bandit problem when incurred losses are
non-stationary. The conditional value-at-risk (CVaR) is used as the objective function. Two …

Online Resource Allocation and its Applications

Q Zhu - 2022 - search.proquest.com
Online Resource Allocation and its Applications, by Qiuyu Zhu (BS, University of Science and …