Decision-theoretic distributed channel selection for opportunistic spectrum access: Strategies, challenges and solutions

Y Xu, A Anpalagan, Q Wu, L Shen… - … Surveys & Tutorials, 2013 - ieeexplore.ieee.org
Opportunistic spectrum access (OSA) has been regarded as the most promising approach to
solve the paradox between spectrum scarcity and waste. Intelligent decision making is key …

[BOOK][B] Bandit algorithms

T Lattimore, C Szepesvári - 2020 - books.google.com
Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …

Combinatorial multi-armed bandit: General framework and applications

W Chen, Y Wang, Y Yuan - International conference on …, 2013 - proceedings.mlr.press
We define a general framework for a large class of combinatorial multi-armed bandit (CMAB)
problems, where simple arms with unknown distributions form super arms. In each round …

[BOOK][B] Multi-armed bandit allocation indices

J Gittins, K Glazebrook, R Weber - 2011 - books.google.com
In 1989 the first edition of this book set out Gittins' pioneering index solution to the
multi-armed bandit problem and his subsequent investigation of a wide class of sequential resource …

Combinatorial sleeping bandits with fairness constraints

F Li, J Liu, B Ji - IEEE Transactions on Network Science and …, 2019 - ieeexplore.ieee.org
The multi-armed bandit (MAB) model has been widely adopted for studying many practical
optimization problems (network resource allocation, ad placement, crowdsourcing, etc.) with …

Combinatorial network optimization with unknown variables: Multi-armed bandits with linear rewards and individual observations

Y Gai, B Krishnamachari, R Jain - IEEE/ACM Transactions on …, 2012 - ieeexplore.ieee.org
We formulate the following combinatorial multi-armed bandit (MAB) problem: There are N
random variables with unknown mean that are each instantiated in an iid fashion over time …

Sample mean based index policies by O(log n) regret for the multi-armed bandit problem

R Agrawal - Advances in Applied Probability, 1995 - cambridge.org
We consider a non-Bayesian infinite horizon version of the multi-armed bandit problem with
the objective of designing simple policies whose regret increases slowly with time. In their …

Dynamic assortment optimization with a multinomial logit choice model and capacity constraint

P Rusmevichientong, ZJM Shen… - Operations …, 2010 - pubsonline.informs.org
We consider an assortment optimization problem where a retailer chooses an assortment of
products that maximizes the profit subject to a capacity constraint. The demand is …

Distributed learning in multi-armed bandit with multiple players

K Liu, Q Zhao - IEEE Transactions on Signal Processing, 2010 - ieeexplore.ieee.org
We formulate and study a decentralized multi-armed bandit (MAB) problem. There are M
distributed players competing for N independent arms. Each arm, when played, offers iid …

Combinatorial multi-armed bandit and its extension to probabilistically triggered arms

W Chen, Y Wang, Y Yuan, Q Wang - Journal of Machine Learning …, 2016 - jmlr.org
In the past few years, differential privacy has become a standard concept in the area of
privacy. One of the most important problems in this field is to answer queries while …