Nested bandits

Off-policy evaluation of slate bandit policies via optimizing abstraction

H Kiyohara, M Nomura, Y Saito - Proceedings of the ACM on Web …, 2024 - dl.acm.org

We study off-policy evaluation (OPE) in the problem of slate contextual bandits where a
policy selects multi-dimensional actions known as slates. This problem is widespread in …

Cited by 5 · All 4 versions

Nested replicator dynamics, nested logit choice, and similarity-based learning

P Mertikopoulos, WH Sandholm - Journal of Economic Theory, 2024 - Elsevier

We consider a model of learning and evolution in games whose action sets are endowed
with a partition-based similarity structure intended to capture exogenous similarities …

Cited by 1 · All 13 versions

Efficient methods in counterfactual policy learning and sequential decision making

H Zenati - 2023 - theses.hal.science

Because logged data has become ubiquitous in wide-range applications and since
online exploration may be sensitive, counterfactual methods have gained significant …
