- Academic resource search

[BOOK][B] Bandit algorithms

T Lattimore, C Szepesvári - 2020 - books.google.com

Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …

Cited by 3274 · Related articles · All 9 versions

[PDF] arxiv.org

A survey of online experiment design with the stochastic multi-armed bandit

G Burtini, J Loeppky, R Lawrence - arXiv preprint arXiv:1510.00757, 2015 - arxiv.org

Adaptive and sequential experiment design is a well-studied area in numerous domains. We
survey and synthesize the work of the online statistical learning paradigm referred to as multi …

Cited by 165 · Related articles · All 3 versions

[PDF] sagepub.com

Recommendation system for adaptive learning

Y Chen, X Li, J Liu, Z Ying - Applied psychological …, 2018 - journals.sagepub.com

An adaptive learning system aims at providing instruction tailored to the current status of a
learner, differing from the traditional classroom experience. The latest advances in …

Cited by 131 · Related articles · All 11 versions

[PDF] springer.com

Reinforcement learning for sequential decision making in population research

N Deliu - Quality & Quantity, 2024 - Springer

Reinforcement learning (RL) algorithms have been long recognized as powerful tools for
optimal sequential decision making. The framework is concerned with a decision maker, the …

Cited by 11 · Related articles · All 4 versions

[PDF] acm.org

The Gittins policy is nearly optimal in the M/G/k under extremely general conditions

Z Scully, I Grosof, M Harchol-Balter - … of the ACM on Measurement and …, 2020 - dl.acm.org

The Gittins scheduling policy minimizes the mean response in the single-server M/G/1
queue in a wide variety of settings. Most famously, Gittins is optimal when preemption is …

Cited by 36 · Related articles · All 5 versions

[BOOK][B] Multi-armed bandits: Theory and applications to online learning in networks

Q Zhao - 2019 - books.google.com

Multi-armed bandit problems pertain to optimal sequential decision making and learning in
unknown environments. Since the first bandit problem posed by Thompson in 1933 for the …

Cited by 47 · Related articles · All 4 versions

[PDF] arxiv.org

The assistive multi-armed bandit

L Chan, D Hadfield-Menell, S Srinivasa… - 2019 14th ACM/IEEE …, 2019 - ieeexplore.ieee.org

Learning preferences implicit in the choices humans make is a well studied problem in both
economics and computer science. However, most work makes the assumption that humans …

Cited by 55 · Related articles · All 6 versions

[PDF] cmu.edu

A new toolbox for scheduling theory

Z Scully - ACM SIGMETRICS Performance Evaluation Review, 2023 - dl.acm.org

Queueing delays are ubiquitous in many domains, including computer systems, service
systems, communication networks, supply chains, and transportation. Queueing and …

Cited by 14 · Related articles · All 11 versions

[PDF] mcgill.ca

Conditions for indexability of restless bandits and an algorithm to compute Whittle index

N Akbarzadeh, A Mahajan - Advances in Applied Probability, 2022 - cambridge.org

Restless bandits are a class of sequential resource allocation problems concerned with
allocating one or more resources among several alternative processes where the evolution …

Cited by 26 · Related articles · All 8 versions

[PDF] neurips.cc

Multi-armed bandits with bounded arm-memory: Near-optimal guarantees for best-arm identification and regret minimization

A Maiti, V Patil, A Khan - Advances in Neural Information …, 2021 - proceedings.neurips.cc

We study the Stochastic Multi-armed Bandit problem under bounded arm-memory.
In this setting, the arms arrive in a stream, and the number of arms that can be stored in the …

Cited by 16 · Related articles · All 7 versions
