[BOOK][B] Bandit algorithms

T Lattimore, C Szepesvári - 2020 - books.google.com
Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …

Parallelised Bayesian optimisation via Thompson sampling

K Kandasamy, A Krishnamurthy… - International …, 2018 - proceedings.mlr.press
We design and analyse variations of the classical Thompson sampling (TS) procedure for
Bayesian optimisation (BO) in settings where function evaluations are expensive but can be …

Delayed gradient averaging: Tolerate the communication latency for federated learning

L Zhu, H Lin, Y Lu, Y Lin, S Han - Advances in Neural …, 2021 - proceedings.neurips.cc
Federated Learning is an emerging direction in distributed machine learning that enables
jointly training a model without sharing the data. Since the data is distributed across many …

[PDF][PDF] Parallelizing exploration-exploitation tradeoffs in Gaussian process bandit optimization.

T Desautels, A Krause, JW Burdick - J. Mach. Learn. Res., 2014 - jmlr.org
How can we take advantage of opportunities for experimental parallelization in
exploration-exploitation tradeoffs? In many experimental scenarios, it is often desirable to …

Learning-aided computation offloading for trusted collaborative mobile edge computing

Y Li, X Wang, X Gan, H Jin, L Fu… - IEEE Transactions on …, 2019 - ieeexplore.ieee.org
Cooperative offloading in mobile edge computing enables resource-constrained edge
clouds to help each other with computation-intensive tasks. However, the power of such …

Federated bandit: A gossiping approach

Z Zhu, J Zhu, J Liu, Y Liu - Proceedings of the ACM on Measurement …, 2021 - dl.acm.org
In this paper, we study Federated Bandit, a decentralized Multi-Armed Bandit problem with a
set of N agents, who can only communicate their local data with neighbors described by a …

Machine learning for fraud detection in e-Commerce: A research agenda

N Tax, KJ de Vries, M de Jong, N Dosoula… - … Machine Learning for …, 2021 - Springer
Fraud detection and prevention play an important part in ensuring the sustained operation of
any e-commerce business. Machine learning (ML) often plays an important role in these anti …

Learning in generalized linear contextual bandits with stochastic delays

Z Zhou, R Xu, J Blanchet - Advances in Neural Information …, 2019 - proceedings.neurips.cc
In this paper, we consider online learning in generalized linear contextual bandits where
rewards are not immediately observed. Instead, rewards are available to the decision maker …

Decentralized cooperative stochastic bandits

D Martínez-Rubio, V Kanade… - Advances in Neural …, 2019 - proceedings.neurips.cc
We study a decentralized cooperative stochastic multi-armed bandit problem with K arms on
a network of N agents. In our model, the reward distribution of each arm is the same for each …

Bandits with delayed, aggregated anonymous feedback

C Pike-Burke, S Agrawal… - International …, 2018 - proceedings.mlr.press
We study a variant of the stochastic $K$-armed bandit problem, which we call "bandits with
delayed, aggregated anonymous feedback". In this problem, when the player pulls an arm, a …