Online learning: A comprehensive survey

SCH Hoi, D Sahoo, J Lu, P Zhao - Neurocomputing, 2021 - Elsevier
Online learning represents a family of machine learning methods, where a learner attempts
to tackle some predictive (or any type of decision-making) task by learning from a sequence …

[BOOK][B] Bandit algorithms

T Lattimore, C Szepesvári - 2020 - books.google.com
Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …
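
To ground the multi-armed bandit model that this book formalizes, the following is a minimal sketch of the stochastic bandit interaction loop with a UCB1-style index. The Bernoulli reward environment, the name ucb1, and all parameters are illustrative assumptions, not material from the book.

import math
import random

def ucb1(true_means, horizon, seed=0):
    """Sketch: stochastic bandit loop with the classic UCB1 index (assumed Bernoulli rewards)."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms          # pulls per arm
    sums = [0.0] * n_arms          # cumulative reward per arm

    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1            # pull each arm once to initialize
        else:
            # UCB1 index: empirical mean + exploration bonus
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward

    return counts, sums

counts, sums = ucb1([0.3, 0.5, 0.7], horizon=10_000)
print("pulls per arm:", counts)    # most pulls should concentrate on the 0.7 arm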

Combinatorial sleeping bandits with fairness constraints

F Li, J Liu, B Ji - IEEE Transactions on Network Science and …, 2019 - ieeexplore.ieee.org
The multi-armed bandit (MAB) model has been widely adopted for studying many practical
optimization problems (network resource allocation, ad placement, crowdsourcing, etc.) with …

Thompson sampling for combinatorial semi-bandits

S Wang, W Chen - International Conference on Machine …, 2018 - proceedings.mlr.press
We study the application of the Thompson sampling (TS) methodology to the stochastic
combinatorial multi-armed bandit (CMAB) framework. We analyze the standard TS algorithm …
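
As a rough illustration of the Thompson sampling methodology for semi-bandit feedback that this paper analyzes, below is a sketch for the simplest combinatorial action set: choose m of n Bernoulli base arms, keep a Beta posterior per arm, sample means, and let a top-m oracle pick the super arm. This specific action set, the priors, and all names are my assumptions; the paper treats the general CMAB framework.

import random

def thompson_semi_bandit(true_means, m, horizon, seed=0):
    """Sketch: TS with Beta posteriors; super arms = any m of the n base arms."""
    rng = random.Random(seed)
    n = len(true_means)
    alpha = [1] * n   # Beta(1, 1) priors
    beta = [1] * n

    for _ in range(horizon):
        # Sample a mean for every base arm from its posterior.
        theta = [rng.betavariate(alpha[i], beta[i]) for i in range(n)]
        # Offline oracle for this action set: take the m largest sampled means.
        super_arm = sorted(range(n), key=lambda i: theta[i], reverse=True)[:m]
        # Semi-bandit feedback: observe the outcome of every played base arm.
        for i in super_arm:
            if rng.random() < true_means[i]:
                alpha[i] += 1
            else:
                beta[i] += 1

    return alpha, beta

alpha, beta = thompson_semi_bandit([0.2, 0.4, 0.6, 0.8, 0.5], m=2, horizon=5_000)
print([round(a / (a + b), 3) for a, b in zip(alpha, beta)])  # posterior means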

Bandit algorithms: A comprehensive review and their dynamic selection from a portfolio for multicriteria top-k recommendation

A Letard, N Gutowski, O Camp, T Amghar - Expert Systems with …, 2024 - Elsevier
This paper discusses the use of portfolio approaches based on bandit algorithms to optimize
multicriteria decision-making in recommender systems (accuracy and diversity). While …

Combinatorial multi-armed bandit with general reward functions

W Chen, W Hu, F Li, J Li, Y Liu… - Advances in Neural …, 2016 - proceedings.neurips.cc
In this paper, we study the stochastic combinatorial multi-armed bandit (CMAB) framework
that allows a general nonlinear reward function, whose expected value may not depend only …
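
The abstract's key point, that a nonlinear reward's expectation may not be determined by the base arms' means alone, can be checked with a tiny example (mine, not the paper's): two arms with identical mean 0.5 but different distributions give different expected rewards under the nonlinear reward max(·).

import random

# Two base-arm distributions with the same mean 0.5:
#   arm_a: always 0.5        arm_b: Bernoulli(0.5) on {0, 1}
# For the nonlinear reward max(outcomes), the expected reward of the
# super arm is NOT determined by the means alone.
rng = random.Random(0)

def arm_a():
    return 0.5

def arm_b():
    return 1.0 if rng.random() < 0.5 else 0.0

trials = 100_000
est_max_ab = sum(max(arm_a(), arm_b()) for _ in range(trials)) / trials
est_max_aa = sum(max(arm_a(), arm_a()) for _ in range(trials)) / trials

print(round(est_max_ab, 3))  # ~0.75: pairing with the spread-out arm raises E[max]
print(round(est_max_aa, 3))  # 0.5: two copies of the deterministic arm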

Minimal exploration in structured stochastic bandits

R Combes, S Magureanu… - Advances in Neural …, 2017 - proceedings.neurips.cc
This paper introduces and addresses a wide class of stochastic bandit problems where the
function mapping the arm to the corresponding reward exhibits some known structural …

Budget-constrained multi-armed bandits with multiple plays

D Zhou, C Tomlin - Proceedings of the AAAI Conference on Artificial …, 2018 - ojs.aaai.org
We study the multi-armed bandit problem with multiple plays and a budget constraint for
both the stochastic and the adversarial setting. At each round, exactly K out of N possible …
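
A minimal sketch of the interaction protocol described here: every round exactly K of the N arms are played, each play consumes part of a global budget, and the run stops when the budget is exhausted. The optimistic reward-to-cost index and the Bernoulli rewards / two-point costs below are illustrative assumptions, not the paper's algorithms.

import math
import random

def budgeted_multiple_play(reward_means, cost_means, K, budget, seed=0):
    """Sketch: play exactly K of N arms per round until the budget runs out,
    ranking arms by an optimistic reward-to-cost index (illustrative choice)."""
    rng = random.Random(seed)
    n = len(reward_means)
    pulls = [0] * n
    reward_sum = [0.0] * n
    cost_sum = [0.0] * n
    total_reward, t = 0.0, 0

    while budget > 0:
        t += 1

        def index(a):
            if pulls[a] == 0:
                return float("inf")        # force initial exploration
            bonus = math.sqrt(2 * math.log(t) / pulls[a])
            return (reward_sum[a] / pulls[a]) / (cost_sum[a] / pulls[a]) + bonus

        chosen = sorted(range(n), key=index, reverse=True)[:K]
        for a in chosen:                    # exactly K plays this round
            r = 1.0 if rng.random() < reward_means[a] else 0.0
            c = 1.0 if rng.random() < cost_means[a] else 0.5   # costs in {0.5, 1}
            pulls[a] += 1
            reward_sum[a] += r
            cost_sum[a] += c
            budget -= c
            total_reward += r

    return total_reward

print(budgeted_multiple_play([0.8, 0.6, 0.4, 0.2], [0.5, 0.5, 0.5, 0.5],
                             K=2, budget=500.0))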

Contextual combinatorial cascading bandits

S Li, B Wang, S Zhang, W Chen - … conference on machine …, 2016 - proceedings.mlr.press
We propose the contextual combinatorial cascading bandits, a combinatorial online learning
game, where at each time step a learning agent is given a set of contextual information, then …
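
To make the cascading feedback in this game concrete, here is a sketch of a plain (non-contextual) cascading bandit round: the learner recommends K items, the user clicks the first attractive one, and only the examined prefix is observed. The contextual and general combinatorial parts of the paper are not modeled here; the UCB-style index and all names are illustrative assumptions.

import math
import random

def cascade_ucb(attraction_probs, K, horizon, seed=0):
    """Sketch of cascading feedback with a UCB1-style index (non-contextual)."""
    rng = random.Random(seed)
    n = len(attraction_probs)
    pulls = [0] * n
    clicks = [0] * n

    for t in range(1, horizon + 1):
        def ucb(i):
            if pulls[i] == 0:
                return float("inf")
            return clicks[i] / pulls[i] + math.sqrt(1.5 * math.log(t) / pulls[i])

        ranked = sorted(range(n), key=ucb, reverse=True)[:K]  # recommended list

        # The user scans top-down and clicks the first attractive item;
        # only the examined prefix (up to and including the click) is observed.
        for item in ranked:
            attracted = rng.random() < attraction_probs[item]
            pulls[item] += 1
            if attracted:
                clicks[item] += 1
                break   # items below the click are never examined

    return [round(c / max(p, 1), 3) for c, p in zip(clicks, pulls)]

print(cascade_ucb([0.1, 0.2, 0.7, 0.4, 0.05], K=3, horizon=20_000))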

Learning unknown service rates in queues: A multiarmed bandit approach

S Krishnasamy, R Sen, R Johari… - Operations …, 2021 - pubsonline.informs.org
Consider a queueing system consisting of multiple servers. Jobs arrive over time and enter a
queue for service; the goal is to minimize the size of this queue. At each opportunity for …
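
A minimal sketch of the queue-plus-bandit loop outlined in this abstract, under my own simplifying assumptions: Bernoulli arrivals, one head-of-line job scheduled per slot, and unknown Bernoulli service probabilities estimated with an optimistic index. It illustrates the interaction, not the paper's algorithm or regret analysis.

import math
import random

def queue_bandit(arrival_prob, service_probs, horizon, seed=0):
    """Each slot: maybe one arrival; if the queue is nonempty, route the
    head-of-line job to the server with the highest optimistic estimate of
    its unknown service probability (illustrative assumptions)."""
    rng = random.Random(seed)
    n = len(service_probs)
    attempts = [0] * n
    successes = [0] * n
    queue_len, queue_history = 0, []

    for t in range(1, horizon + 1):
        if rng.random() < arrival_prob:
            queue_len += 1                         # Bernoulli arrival
        if queue_len > 0:
            def ucb(k):
                if attempts[k] == 0:
                    return float("inf")
                return successes[k] / attempts[k] + math.sqrt(2 * math.log(t) / attempts[k])

            server = max(range(n), key=ucb)
            attempts[server] += 1
            if rng.random() < service_probs[server]:   # service succeeds
                successes[server] += 1
                queue_len -= 1
        queue_history.append(queue_len)

    return sum(queue_history) / horizon            # time-average queue length

print(queue_bandit(0.4, [0.3, 0.5, 0.8], horizon=50_000))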