Mutual observability and the convergence of actions in a multi-person two-armed bandit model

AV den Boer, JM Meylahn… - Amsterdam Law School …, 2022 - papers.ssrn.com

We examine recent claims that a particular Q-learning algorithm used by
competitors 'autonomously' and systematically learns to collude, resulting in supracompetitive …

Cited by 27 · Related articles · All 8 versions

[PDF] kolotilin.com

The heterogeneity of concentrated prescribing behavior: Theory and evidence from antipsychotics

ER Berndt, RS Gibbons, A Kolotilin, AL Taub - Journal of health economics, 2015 - Elsevier

We present two new findings based on annual antipsychotic US prescribing data from IMS
Health on 2867 psychiatrists who wrote 50 or more prescriptions in 2007. First, many of …

Cited by 48 · Related articles · All 10 versions

[HTML] sciencedirect.com

[HTML] Teaching and leading an ad hoc teammate: Collaboration without pre-coordination

P Stone, GA Kaminka, S Kraus, JS Rosenschein… - Artificial Intelligence, 2013 - Elsevier

As autonomous agents proliferate in the real world, both in software and robotic settings,
they will increasingly need to band together for cooperative activities with previously …

Cited by 45 · Related articles · All 16 versions

[PDF] ssrn.com

On games of strategic experimentation

D Rosenberg, A Salomon, N Vieille - Games and Economic Behavior, 2013 - Elsevier

We study a class of symmetric strategic experimentation games. Each of two players faces
an (exponential) two-armed bandit problem, and must decide when to stop experimenting …

Cited by 47 · Related articles · All 17 versions

[PDF] brunel.ac.uk

Strategic private experimentation

M Felgenhauer, E Schulte - American Economic Journal …, 2014 - aeaweb.org

We consider a model of persuasion in which an agent who tries to persuade a decision
maker can sequentially acquire imperfect signals. The agent's information acquisition is …

Cited by 49 · Related articles · All 9 versions

[PDF] arxiv.org

Dueling Over Dessert, Mastering the Art of Repeated Cake Cutting

S Brânzei, MT Hajiaghayi, R Phillips, S Shin… - arXiv preprint arXiv …, 2024 - arxiv.org

We consider the setting of repeated fair division between two players, denoted Alice and
Bob, with private valuations over a cake. In each round, a new cake arrives, which is …

Cited by 2 · Related articles · All 3 versions

A partial folk theorem for games with unknown payoff distributions

T Wiseman - Econometrica, 2005 - Wiley Online Library

Repeated games with unknown payoff distributions are analogous to a single decision
maker's “multi‐armed bandit” problem. Each state of the world corresponds to a different …

Cited by 60 · Related articles · All 7 versions

[PDF] mlr.press

Multiplayer bandit learning, from competition to cooperation

S Brânzei, Y Peres - Conference on Learning Theory, 2021 - proceedings.mlr.press

The stochastic multi-armed bandit model captures the tradeoff between exploration and
exploitation. We study the effects of competition and cooperation on this tradeoff. Suppose …

Cited by 11 · Related articles · All 6 versions

[PDF] academia.edu

[PDF] Ad hoc teamwork modeled with multi-armed bandits: An extension to discounted infinite rewards

S Barrett, P Stone - Proceedings of 2011 AAMAS workshop on …, 2011 - academia.edu

Before deployment, agents designed for multiagent team settings are commonly developed
together or are given standardized communication and coordination protocols. However, in …

Cited by 22 · Related articles · All 7 versions

[PDF] utexas.edu

[BOOK] Making friends on the fly: advances in ad hoc teamwork

S Barrett - 2015 - Springer

Robots are becoming cheaper and more durable as manufacturing processes improve, and
they are becoming useful for an increasing number of tasks as artificial intelligence …

Cited by 14 · Related articles · All 10 versions
