Multi-armed bandit problem with temporally-partitioned rewards: When partial feedback counts


5 results (0.03 seconds)

[PDF] arxiv.org

Delays in reinforcement learning

P Liotet - arXiv preprint arXiv:2309.11096, 2023 - arxiv.org

Delays are inherent to most dynamical systems. Besides shifting the process in time, they
can significantly affect their performance. For this reason, it is usually valuable to study the …

Cited by 4 · Related articles · All 4 versions

[PDF] arxiv.org

Multi-Armed Bandits with Generalized Temporally-Partitioned Rewards

RC Broek, R Litjens, T Sagis, N Verbeeke… - … Symposium on Intelligent …, 2024 - Springer

Decision-making problems of sequential nature, where decisions made in the past may
have an impact on the future, are used to model many practically important applications. In …

Cited by 1 · Related articles · All 4 versions

[PDF] arxiv.org

Generalizing distribution of partial rewards for multi-armed bandits with temporally-partitioned rewards

RC Broek, R Litjens, T Sagis, L Siecker… - arXiv preprint arXiv …, 2022 - arxiv.org

We investigate the Multi-Armed Bandit problem with Temporally-Partitioned Rewards (TP-MAB)
setting in this paper. In the TP-MAB setting, an agent will receive subsets of the reward …

[PDF] polimi.it

Pricing and advertising strategies in e-commerce scenarios

G Romano - 2022 - politesi.polimi.it

This thesis revolves around the problem of selling and advertising products on the Web and
exploits techniques from the fields of algorithmic game theory, mechanism design, and …

Stochastic linear bandits with global-local structure

FF Gonzales - 2021 - politesi.polimi.it

This work pertains to the field of Multi-Armed-Bandits (MAB), a framework in online learning
where an agent sequentially chooses from a set of available actions, called arms, and …
