Online learning: A comprehensive survey

SCH Hoi, D Sahoo, J Lu, P Zhao - Neurocomputing, 2021 - Elsevier
Online learning represents a family of machine learning methods, where a learner attempts
to tackle some predictive (or any type of decision-making) task by learning from a sequence …

[BOOK][B] Bandit algorithms

T Lattimore, C Szepesvári - 2020 - books.google.com
Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …
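
To ground the multi-armed bandit model that this book formalizes, the following is a minimal sketch of the stochastic bandit interaction loop with a UCB1-style index. The Bernoulli reward environment, the name ucb1, and all parameters are illustrative assumptions, not material from the book.

import math
import random

def ucb1(true_means, horizon, seed=0):
    """Sketch: stochastic bandit loop with the classic UCB1 index (assumed Bernoulli rewards)."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms          # pulls per arm
    sums = [0.0] * n_arms          # cumulative reward per arm

    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1            # pull each arm once to initialize
        else:
            # UCB1 index: empirical mean + exploration bonus
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward

    return counts, sums

counts, sums = ucb1([0.3, 0.5, 0.7], horizon=10_000)
print("pulls per arm:", counts)    # most pulls should concentrate on the 0.7 arm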

Combinatorial sleeping bandits with fairness constraints

F Li, J Liu, B Ji - IEEE Transactions on Network Science and …, 2019 - ieeexplore.ieee.org
The multi-armed bandit (MAB) model has been widely adopted for studying many practical
optimization problems (network resource allocation, ad placement, crowdsourcing, etc.) with …

Thompson sampling for combinatorial semi-bandits

S Wang, W Chen - International Conference on Machine …, 2018 - proceedings.mlr.press
We study the application of the Thompson sampling (TS) methodology to the stochastic
combinatorial multi-armed bandit (CMAB) framework. We analyze the standard TS algorithm …
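
As a rough illustration of the Thompson sampling methodology for semi-bandit feedback that this paper analyzes, below is a sketch for the simplest combinatorial action set: choose m of n Bernoulli base arms, keep a Beta posterior per arm, sample means, and let a top-m oracle pick the super arm. This specific action set, the priors, and all names are my assumptions; the paper treats the general CMAB framework.

import random

def thompson_semi_bandit(true_means, m, horizon, seed=0):
    """Sketch: TS with Beta posteriors; super arms = any m of the n base arms."""
    rng = random.Random(seed)
    n = len(true_means)
    alpha = [1] * n   # Beta(1, 1) priors
    beta = [1] * n

    for _ in range(horizon):
        # Sample a mean for every base arm from its posterior.
        theta = [rng.betavariate(alpha[i], beta[i]) for i in range(n)]
        # Offline oracle for this action set: take the m largest sampled means.
        super_arm = sorted(range(n), key=lambda i: theta[i], reverse=True)[:m]
        # Semi-bandit feedback: observe the outcome of every played base arm.
        for i in super_arm:
            if rng.random() < true_means[i]:
                alpha[i] += 1
            else:
                beta[i] += 1

    return alpha, beta

alpha, beta = thompson_semi_bandit([0.2, 0.4, 0.6, 0.8, 0.5], m=2, horizon=5_000)
print([round(a / (a + b), 3) for a, b in zip(alpha, beta)])  # posterior means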

Bandit algorithms: A comprehensive review and their dynamic selection from a portfolio for multicriteria top-k recommendation

A Letard, N Gutowski, O Camp, T Amghar - Expert Systems with …, 2024 - Elsevier
This paper discusses the use of portfolio approaches based on bandit algorithms to optimize
multicriteria decision-making in recommender systems (accuracy and diversity). While …

Combinatorial multi-armed bandit with general reward functions

W Chen, W Hu, F Li, J Li, Y Liu… - Advances in Neural …, 2016 - proceedings.neurips.cc
In this paper, we study the stochastic combinatorial multi-armed bandit (CMAB) framework
that allows a general nonlinear reward function, whose expected value may not depend only …
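
The abstract's key point, that a nonlinear reward's expectation may not be determined by the base arms' means alone, can be checked with a tiny example (mine, not the paper's): two arms with identical mean 0.5 but different distributions give different expected rewards under the nonlinear reward max(·).

import random

# Two base-arm distributions with the same mean 0.5:
#   arm_a: always 0.5        arm_b: Bernoulli(0.5) on {0, 1}
# For the nonlinear reward max(outcomes), the expected reward of the
# super arm is NOT determined by the means alone.
rng = random.Random(0)

def arm_a():
    return 0.5

def arm_b():
    return 1.0 if rng.random() < 0.5 else 0.0

trials = 100_000
est_max_ab = sum(max(arm_a(), arm_b()) for _ in range(trials)) / trials
est_max_aa = sum(max(arm_a(), arm_a()) for _ in range(trials)) / trials

print(round(est_max_ab, 3))  # ~0.75: pairing with the spread-out arm raises E[max]
print(round(est_max_aa, 3))  # 0.5: two copies of the deterministic arm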

Minimal exploration in structured stochastic bandits

R Combes, S Magureanu… - Advances in Neural …, 2017 - proceedings.neurips.cc
This paper introduces and addresses a wide class of stochastic bandit problems where the
function mapping the arm to the corresponding reward exhibits some known structural …

Budget-constrained multi-armed bandits with multiple plays

D Zhou, C Tomlin - Proceedings of the AAAI Conference on Artificial …, 2018 - ojs.aaai.org
We study the multi-armed bandit problem with multiple plays and a budget constraint for
both the stochastic and the adversarial setting. At each round, exactly K out of N possible …
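
A minimal sketch of the interaction protocol described here: every round exactly K of the N arms are played, each play consumes part of a global budget, and the run stops when the budget is exhausted. The optimistic reward-to-cost index and the Bernoulli rewards / two-point costs below are illustrative assumptions, not the paper's algorithms.

import math
import random

def budgeted_multiple_play(reward_means, cost_means, K, budget, seed=0):
    """Sketch: play exactly K of N arms per round until the budget runs out,
    ranking arms by an optimistic reward-to-cost index (illustrative choice)."""
    rng = random.Random(seed)
    n = len(reward_means)
    pulls = [0] * n
    reward_sum = [0.0] * n
    cost_sum = [0.0] * n
    total_reward, t = 0.0, 0

    while budget > 0:
        t += 1

        def index(a):
            if pulls[a] == 0:
                return float("inf")        # force initial exploration
            bonus = math.sqrt(2 * math.log(t) / pulls[a])
            return (reward_sum[a] / pulls[a]) / (cost_sum[a] / pulls[a]) + bonus

        chosen = sorted(range(n), key=index, reverse=True)[:K]
        for a in chosen:                    # exactly K plays this round
            r = 1.0 if rng.random() < reward_means[a] else 0.0
            c = 1.0 if rng.random() < cost_means[a] else 0.5   # costs in {0.5, 1}
            pulls[a] += 1
            reward_sum[a] += r
            cost_sum[a] += c
            budget -= c
            total_reward += r

    return total_reward

print(budgeted_multiple_play([0.8, 0.6, 0.4, 0.2], [0.5, 0.5, 0.5, 0.5],
                             K=2, budget=500.0))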

Contextual combinatorial cascading bandits

S Li, B Wang, S Zhang, W Chen - … conference on machine …, 2016 - proceedings.mlr.press
We propose the contextual combinatorial cascading bandits, a combinatorial online learning
game, where at each time step a learning agent is given a set of contextual information, then …
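
To make the cascading feedback in this game concrete, here is a sketch of a plain (non-contextual) cascading bandit round: the learner recommends K items, the user clicks the first attractive one, and only the examined prefix is observed. The contextual and general combinatorial parts of the paper are not modeled here; the UCB-style index and all names are illustrative assumptions.

import math
import random

def cascade_ucb(attraction_probs, K, horizon, seed=0):
    """Sketch of cascading feedback with a UCB1-style index (non-contextual)."""
    rng = random.Random(seed)
    n = len(attraction_probs)
    pulls = [0] * n
    clicks = [0] * n

    for t in range(1, horizon + 1):
        def ucb(i):
            if pulls[i] == 0:
                return float("inf")
            return clicks[i] / pulls[i] + math.sqrt(1.5 * math.log(t) / pulls[i])

        ranked = sorted(range(n), key=ucb, reverse=True)[:K]  # recommended list

        # The user scans top-down and clicks the first attractive item;
        # only the examined prefix (up to and including the click) is observed.
        for item in ranked:
            attracted = rng.random() < attraction_probs[item]
            pulls[item] += 1
            if attracted:
                clicks[item] += 1
                break   # items below the click are never examined

    return [round(c / max(p, 1), 3) for c, p in zip(clicks, pulls)]

print(cascade_ucb([0.1, 0.2, 0.7, 0.4, 0.05], K=3, horizon=20_000))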

Learning unknown service rates in queues: A multiarmed bandit approach

S Krishnasamy, R Sen, R Johari… - Operations …, 2021 - pubsonline.informs.org
Consider a queueing system consisting of multiple servers. Jobs arrive over time and enter a
queue for service; the goal is to minimize the size of this queue. At each opportunity for …
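
A minimal sketch of the queue-plus-bandit loop outlined in this abstract, under my own simplifying assumptions: Bernoulli arrivals, one head-of-line job scheduled per slot, and unknown Bernoulli service probabilities estimated with an optimistic index. It illustrates the interaction, not the paper's algorithm or regret analysis.

import math
import random

def queue_bandit(arrival_prob, service_probs, horizon, seed=0):
    """Each slot: maybe one arrival; if the queue is nonempty, route the
    head-of-line job to the server with the highest optimistic estimate of
    its unknown service probability (illustrative assumptions)."""
    rng = random.Random(seed)
    n = len(service_probs)
    attempts = [0] * n
    successes = [0] * n
    queue_len, queue_history = 0, []

    for t in range(1, horizon + 1):
        if rng.random() < arrival_prob:
            queue_len += 1                         # Bernoulli arrival
        if queue_len > 0:
            def ucb(k):
                if attempts[k] == 0:
                    return float("inf")
                return successes[k] / attempts[k] + math.sqrt(2 * math.log(t) / attempts[k])

            server = max(range(n), key=ucb)
            attempts[server] += 1
            if rng.random() < service_probs[server]:   # service succeeds
                successes[server] += 1
                queue_len -= 1
        queue_history.append(queue_len)

    return sum(queue_history) / horizon            # time-average queue length

print(queue_bandit(0.4, [0.3, 0.5, 0.8], horizon=50_000))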