Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple...

Decision-theoretic distributed channel selection for opportunistic spectrum access: Strategies, challenges and solutions

Y Xu, A Anpalagan, Q Wu, L Shen… - … Surveys & Tutorials, 2013 - ieeexplore.ieee.org

Opportunistic spectrum access (OSA) has been regarded as the most promising approach to
solve the paradox between spectrum scarcity and waste. Intelligent decision making is key …

Cited by 270 Related articles All 4 versions

[PDF] tor-lattimore.com

[BOOK][B] Bandit algorithms

T Lattimore, C Szepesvári - 2020 - books.google.com

Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …

Cited by 2942 Related articles All 9 versions

[PDF] mlr.press

Combinatorial multi-armed bandit: General framework and applications

W Chen, Y Wang, Y Yuan - International conference on …, 2013 - proceedings.mlr.press

We define a general framework for a large class of combinatorial multi-armed bandit (CMAB)
problems, where simple arms with unknown distributions form super arms. In each round …

Cited by 830 Related articles All 13 versions

[BOOK][B] Multi-armed bandit allocation indices

J Gittins, K Glazebrook, R Weber - 2011 - books.google.com

In 1989 the first edition of this book set out Gittins' pioneering index solution to the
multi-armed bandit problem and his subsequent investigation of a wide class of sequential resource …

Cited by 2110 Related articles All 5 versions

[PDF] ieee.org

Combinatorial sleeping bandits with fairness constraints

F Li, J Liu, B Ji - IEEE Transactions on Network Science and …, 2019 - ieeexplore.ieee.org

The multi-armed bandit (MAB) model has been widely adopted for studying many practical
optimization problems (network resource allocation, ad placement, crowdsourcing, etc.) with …

Cited by 180 Related articles All 10 versions

[PDF] psu.edu

Combinatorial network optimization with unknown variables: Multi-armed bandits with linear rewards and individual observations

Y Gai, B Krishnamachari, R Jain - IEEE/ACM Transactions on …, 2012 - ieeexplore.ieee.org

We formulate the following combinatorial multi-armed bandit (MAB) problem: There are N
random variables with unknown mean that are each instantiated in an iid fashion over time …

Cited by 492 Related articles All 12 versions

Sample mean based index policies with O(log n) regret for the multi-armed bandit problem

R Agrawal - Advances in applied probability, 1995 - cambridge.org

We consider a non-Bayesian infinite horizon version of the multi-armed bandit problem with
the objective of designing simple policies whose regret increases slowly with time. In their …

Cited by 773 Related articles All 7 versions

[PDF] researchgate.net

Dynamic assortment optimization with a multinomial logit choice model and capacity constraint

P Rusmevichientong, ZJM Shen… - Operations …, 2010 - pubsonline.informs.org

We consider an assortment optimization problem where a retailer chooses an assortment of
products that maximizes the profit subject to a capacity constraint. The demand is …

Cited by 567 Related articles All 16 versions

[PDF] arxiv.org

Distributed learning in multi-armed bandit with multiple players

K Liu, Q Zhao - IEEE transactions on signal processing, 2010 - ieeexplore.ieee.org

We formulate and study a decentralized multi-armed bandit (MAB) problem. There are M
distributed players competing for N independent arms. Each arm, when played, offers iid …

Cited by 484 Related articles All 16 versions

[PDF] jmlr.org

Combinatorial multi-armed bandit and its extension to probabilistically triggered arms

W Chen, Y Wang, Y Yuan, Q Wang - Journal of Machine Learning …, 2016 - jmlr.org

In the past few years, differential privacy has become a standard concept in the area of
privacy. One of the most important problems in this field is to answer queries while …

Cited by 250 Related articles All 10 versions
