C Qin, D Russo - arXiv preprint arXiv:2202.09036, 2022 - aeaweb.org
We explore a new model of bandit experiments where a potentially nonstationary sequence of contexts influences arms' performance. Context-unaware algorithms risk confounding …
R Degenne - The Thirty Sixth Annual Conference on …, 2023 - proceedings.mlr.press
In fixed budget bandit identification, an algorithm sequentially observes samples from several distributions up to a given final time. It then answers a query about the set of …
M Jourdan, R Degenne - Advances in Neural Information …, 2024 - proceedings.neurips.cc
A Top Two sampling rule for bandit identification is a method which selects the next arm to sample from among two candidate arms, a leader and a challenger. Due to their simplicity …
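The Top Two rule described in this snippet can be sketched as follows. This is a minimal illustrative implementation, not the authors' exact rule: the function name `top_two_select`, the choice of the empirical-best arm as leader, the empirical runner-up as challenger, and the fixed mixing probability `beta` are all assumptions for illustration (published variants select the challenger via transportation costs or posterior resampling).

```python
import random

def top_two_select(means_hat, beta=0.5):
    """Simple Top Two sampling rule (illustrative sketch).

    Leader: arm with the highest empirical mean.
    Challenger: best arm among the remaining ones (a simplifying
    choice; real rules often use transportation costs or resampling).
    The leader is sampled with probability beta, else the challenger.
    """
    leader = max(range(len(means_hat)), key=lambda i: means_hat[i])
    challenger = max((i for i in range(len(means_hat)) if i != leader),
                     key=lambda i: means_hat[i])
    return leader if random.random() < beta else challenger

# With beta=1.0 the leader is always played; with beta=0.0 the challenger is.
```

The parameter `beta` controls how sampling effort is split between the two candidates; its tuning is a central topic in the Top Two literature.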
We study the problem of best-arm identification with fixed budget in stochastic two-arm bandits with Bernoulli rewards. We prove that, surprisingly, there is no algorithm that (i) …
T Kitagawa, J Rowley - The Japanese Economic Review, 2024 - Springer
Static supervised learning—in which experimental data serves as a training sample for the estimation of an optimal treatment assignment policy—is a commonly assumed framework of …
M Kato - arXiv preprint arXiv:2312.12741, 2023 - arxiv.org
We address the problem of best arm identification (BAI) with a fixed budget for two-armed Gaussian bandits. In BAI, given multiple arms, we aim to find the best arm, an arm with the …
C Qin, D Russo - arXiv preprint arXiv:2402.10592, 2024 - arxiv.org
Practitioners conducting adaptive experiments often encounter two competing priorities: reducing the cost of experimentation by effectively assigning treatments during the …
Economic policies often involve dynamic interventions, where individuals receive repeated interventions over multiple periods. These dynamics make past responses informative to …
We study the best-arm identification (BAI) problem with a fixed budget and contextual (covariate) information. In each round of an adaptive experiment, after observing contextual …