Multi-armed bandit with sub-exponential rewards

H Jia, C Shi, S Shen - Operations Research, 2024 - pubsonline.informs.org

We consider a price-based revenue management problem with finite reusable resources
over a finite time horizon T. Customers arrive following a price-dependent Poisson process …

Cited by 39 · 7 versions

Modeling attrition in recommender systems with departing bandits

O Ben-Porat, L Cohen, L Leqi, ZC Lipton… - Proceedings of the AAAI …, 2022 - ojs.aaai.org

Traditionally, when recommender systems are formalized as multi-armed bandits, the policy
of the recommender system influences the rewards accrued, but not the length of interaction …

Cited by 15 · 7 versions

Decentralized randomly distributed multi-agent multi-armed bandit with heterogeneous rewards

M Xu, D Klabjan - Advances in Neural Information …, 2024 - proceedings.neurips.cc

We study a decentralized multi-agent multi-armed bandit problem in which multiple clients
are connected by time dependent random graphs provided by an environment. The reward …

Cited by 5 · 9 versions

A Combinatorial Semi-Bandit Approach to Charging Station Selection for Electric Vehicles

N Åkerblom, MH Chehreghani - arXiv preprint arXiv:2301.07156, 2023 - arxiv.org

In this work, we address the problem of long-distance navigation for battery electric vehicles
(BEVs), where one or more charging sessions are required to reach the intended …

Cited by 1 · 3 versions

Thompson Sampling on Asymmetric α-Stable Bandits

Z Shi, EE Kuruoglu, X Wei - arXiv preprint arXiv:2203.10214, 2022 - arxiv.org

In algorithm optimization in reinforcement learning, how to deal with the exploration-
exploitation dilemma is particularly important. Multi-armed bandit problem can optimize the …

Cited by 2 · 3 versions

Multi-agent Multi-armed Bandit with Fully Heavy-tailed Dynamics

X Wang, M Xu - arXiv preprint arXiv:2501.19239, 2025 - arxiv.org

We study decentralized multi-agent multi-armed bandits in fully heavy-tailed settings, where
clients communicate over sparse random graphs with heavy-tailed degree distributions and …

Multi-Armed Bandit With Several Agents and Objectives

M Xu - 2024 - search.proquest.com

Multi-armed Bandit (MAB) is a classical online sequential decision-making paradigm which
has wide applications in various areas, such as healthcare, e-commerce, advertisement and …

Adaptive Optimization and Learning for Service Systems

H Jia - 2022 - deepblue.lib.umich.edu

The primary focus of this dissertation is to develop adaptive optimization and learning
models and algorithms for decision-making problems under uncertainty arising in service …

Long-Distance Electric Vehicle Navigation using a Combinatorial Semi-Bandit Approach

N Åkerblom, MH Chehreghani - Sixteenth European Workshop on … - openreview.net

In this work, we address the problem of long-distance navigation for battery electric vehicles
(BEVs), where one or more charging sessions are required to reach the intended …

Human-Centered Machine Learning: A Statistical and Algorithmic Perspective

L Liu - 2023 - kilthub.cmu.edu

Building artificial intelligence systems from a human-centered perspective is increasingly
urgent, as large-scale machine learning systems ranging from personalized recommender …
