Uncertainty and exploration in a restless bandit problem

M Speekenbrink, E Konstantinidis - Topics in Cognitive Science, 2015 - Wiley Online Library
Decision making in noisy and changing environments requires a fine balance between
exploiting knowledge about good courses of action and exploring the environment in order …

A survey of online experiment design with the stochastic multi-armed bandit

G Burtini, J Loeppky, R Lawrence - arXiv preprint arXiv:1510.00757, 2015 - arxiv.org
Adaptive and sequential experiment design is a well-studied area in numerous domains. We
survey and synthesize the work of the online statistical learning paradigm referred to as multi …
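For orientation, the stochastic bandit model such surveys cover is often illustrated with the classic UCB1 rule: pick the arm whose empirical mean plus an exploration bonus (which shrinks as the arm is sampled more often) is largest. The plain-Python sketch below is a textbook illustration of that rule only, not anything specific to this survey; the environment callback pull_arm is an assumed placeholder.

import math

def ucb1(pull_arm, n_arms, horizon):
    # Basic UCB1 for a stochastic bandit with rewards in [0, 1].
    # pull_arm(a) is a hypothetical callback returning the reward of arm a.
    counts = [0] * n_arms
    means = [0.0] * n_arms
    total = 0.0
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # initialisation: pull each arm once
        else:
            arm = max(range(n_arms),
                      key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]))
        r = pull_arm(arm)
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]  # incremental mean update
        total += r
    return total

The bonus term sqrt(2 ln t / n_a) is what drives exploration: rarely pulled arms keep a large bonus until enough evidence about them accumulates.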

Sliding-window Thompson sampling for non-stationary settings

F Trovo, S Paladino, M Restelli, N Gatti - Journal of Artificial Intelligence …, 2020 - jair.org
Multi-Armed Bandit (MAB) techniques have been successfully applied to many
classes of sequential decision problems in the past decades. However, non-stationary …
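The sliding-window idea named in the title is straightforward to sketch: form each arm's posterior only from recent observations, so rewards collected before a change stop influencing arm selection. The snippet below is a minimal illustration of that general idea for Bernoulli rewards, not necessarily the exact algorithm proposed in the paper; the window length and the pull_arm callback are assumptions for the example.

import random
from collections import deque

def sliding_window_thompson(pull_arm, n_arms, horizon, window=200):
    # Bernoulli Thompson sampling using only the last `window` rewards per arm.
    # pull_arm(a) is a hypothetical callback returning 0 or 1.
    history = [deque(maxlen=window) for _ in range(n_arms)]
    total = 0
    for _ in range(horizon):
        # Beta(1 + successes, 1 + failures) posterior built from the window only.
        samples = [random.betavariate(1 + sum(h), 1 + len(h) - sum(h)) for h in history]
        arm = max(range(n_arms), key=lambda a: samples[a])
        reward = pull_arm(arm)
        history[arm].append(reward)  # older observations fall out of the deque
        total += reward
    return total

Shrinking the window makes the learner react faster to abrupt changes, at the price of noisier estimates on stationary stretches.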

Thompson sampling for dynamic multi-armed bandits

N Gupta, OC Granmo… - 2011 10th International …, 2011 - ieeexplore.ieee.org
The importance of multi-armed bandit (MAB) problems is on the rise due to their recent
application in a large variety of areas such as online advertising, news article selection …
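A common way to keep Thompson sampling responsive when reward rates drift is to decay the Beta-posterior pseudo-counts so that old evidence is gradually forgotten. The sketch below shows that discounting idea in general terms; it is not claimed to match the specific update rule of this paper, and gamma and pull_arm are illustrative assumptions.

import random

def discounted_thompson(pull_arm, n_arms, horizon, gamma=0.99):
    # Bernoulli Thompson sampling with exponential forgetting of Beta counts.
    # pull_arm(a) is a hypothetical callback returning 0 or 1; gamma < 1
    # shrinks old pseudo-counts back toward the uniform Beta(1, 1) prior.
    alpha = [1.0] * n_arms
    beta = [1.0] * n_arms
    total = 0
    for _ in range(horizon):
        samples = [random.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        arm = max(range(n_arms), key=lambda a: samples[a])
        r = pull_arm(arm)
        for a in range(n_arms):
            alpha[a] = gamma * alpha[a] + (1 - gamma) * 1.0
            beta[a] = gamma * beta[a] + (1 - gamma) * 1.0
        alpha[arm] += r
        beta[arm] += 1 - r
        total += r
    return total

With gamma = 1 this reduces to standard Thompson sampling; smaller values of gamma trade long-run accuracy for faster tracking of changes.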

Adaptive experimental design: Prospects and applications in political science

M Offer‐Westort, A Coppock… - American Journal of …, 2021 - Wiley Online Library
Experimental researchers in political science frequently face the problem of inferring which
of several treatment arms is most effective. They may also seek to estimate mean outcomes …

Motion planning as online learning: A multi-armed bandit approach to kinodynamic sampling-based planning

M Faroni, D Berenson - IEEE Robotics and Automation Letters, 2023 - ieeexplore.ieee.org
Kinodynamic motion planners allow robots to perform complex manipulation tasks under
dynamics constraints or with black-box models. However, they struggle to find high-quality …

Estimating model utility for deformable object manipulation using multiarmed bandit methods

D McConachie, D Berenson - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
We present a novel approach to deformable object manipulation that does not rely on highly
accurate modeling. The key contribution of this paper is to formulate the task as a …

The hierarchical continuous pursuit learning automation: a novel scheme for environments with large numbers of actions

A Yazidi, X Zhang, L Jiao… - IEEE Transactions on …, 2019 - ieeexplore.ieee.org
Although the field of learning automata (LA) has made significant progress in the past four
decades, the LA-based methods to tackle problems involving environments with a large …

Multi-armed bandit problem with online clustering as side information

A Dzhoha, I Rozora - Journal of Computational and Applied Mathematics, 2023 - Elsevier
We consider the sequential resource allocation problem under the multi-armed bandit model
in a non-stationary stochastic environment. Motivated by many real applications, where …

Adversarial autoencoder and multi-armed bandit for dynamic difficulty adjustment in immersive virtual reality for rehabilitation: Application to hand movement

K Kamikokuryo, T Haga, G Venture, V Hernandez - Sensors, 2022 - mdpi.com
Motor rehabilitation is used to improve motor control skills and thereby the patient's quality of
life. Regular adjustments based on the effect of therapy are necessary, but this can be time …