Online learning under delayed feedback

[Book][B] Bandit algorithms

T Lattimore, C Szepesvári - 2020 - books.google.com

Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …

Cited by 3267 Related articles All 9 versions

[PDF] mlr.press

Parallelised Bayesian optimisation via Thompson sampling

K Kandasamy, A Krishnamurthy… - International …, 2018 - proceedings.mlr.press

We design and analyse variations of the classical Thompson sampling (TS) procedure for
Bayesian optimisation (BO) in settings where function evaluations are expensive but can be …

Cited by 306 Related articles All 5 versions

[PDF] neurips.cc

Delayed gradient averaging: Tolerate the communication latency for federated learning

L Zhu, H Lin, Y Lu, Y Lin, S Han - Advances in Neural …, 2021 - proceedings.neurips.cc

Federated Learning is an emerging direction in distributed machine learning that enables
jointly training a model without sharing the data. Since the data is distributed across many …

Cited by 72 Related articles All 7 versions

[PDF] jmlr.org

[PDF] Parallelizing exploration-exploitation tradeoffs in Gaussian process bandit optimization.

T Desautels, A Krause, JW Burdick - J. Mach. Learn. Res., 2014 - jmlr.org

How can we take advantage of opportunities for experimental parallelization in
exploration-exploitation tradeoffs? In many experimental scenarios, it is often desirable to …

Cited by 503 Related articles All 23 versions

[PDF] github.io

Learning-aided computation offloading for trusted collaborative mobile edge computing

Y Li, X Wang, X Gan, H Jin, L Fu… - IEEE Transactions on …, 2019 - ieeexplore.ieee.org

Cooperative offloading in mobile edge computing enables resource-constrained edge
clouds to help each other with computation-intensive tasks. However, the power of such …

Cited by 152 Related articles All 3 versions

[PDF] acm.org

Federated bandit: A gossiping approach

Z Zhu, J Zhu, J Liu, Y Liu - Proceedings of the ACM on Measurement …, 2021 - dl.acm.org

In this paper, we study Federated Bandit, a decentralized Multi-Armed Bandit problem with a
set of N agents, who can only communicate their local data with neighbors described by a …

Cited by 95 Related articles All 8 versions

[PDF] arxiv.org

Machine learning for fraud detection in e-Commerce: A research agenda

N Tax, KJ de Vries, M de Jong, N Dosoula… - … Machine Learning for …, 2021 - Springer

Fraud detection and prevention play an important part in ensuring the sustained operation of
any e-commerce business. Machine learning (ML) often plays an important role in these anti …

Cited by 22 Related articles All 4 versions

[PDF] neurips.cc

Learning in generalized linear contextual bandits with stochastic delays

Z Zhou, R Xu, J Blanchet - Advances in Neural Information …, 2019 - proceedings.neurips.cc

In this paper, we consider online learning in generalized linear contextual bandits where
rewards are not immediately observed. Instead, rewards are available to the decision maker …

Cited by 107 Related articles All 7 versions

[PDF] neurips.cc

Decentralized cooperative stochastic bandits

D Martínez-Rubio, V Kanade… - Advances in Neural …, 2019 - proceedings.neurips.cc

We study a decentralized cooperative stochastic multi-armed bandit problem with K arms on
a network of N agents. In our model, the reward distribution of each arm is the same for each …

Cited by 123 Related articles All 8 versions

[PDF] mlr.press

Bandits with delayed, aggregated anonymous feedback

C Pike-Burke, S Agrawal… - International …, 2018 - proceedings.mlr.press

We study a variant of the stochastic $K$-armed bandit problem, which we call "bandits with
delayed, aggregated anonymous feedback". In this problem, when the player pulls an arm, a …

Cited by 139 Related articles All 9 versions
