Iterative empirical game solving via single policy best response

J Yao, W Liu, H Fu, Y Yang… - Advances in Neural …, 2024 - proceedings.neurips.cc

Abstract Policy-Space Response Oracles (PSRO) is an influential algorithm framework for
approximating a Nash Equilibrium (NE) in multi-agent non-transitive games. Many previous …

Cited by 10 · Related articles · All 5 versions

Learning in games: a systematic review

RJ Qin, Y Yu - Science China Information Sciences, 2024 - Springer

Game theory studies the mathematical models for self-interested individuals. Nash
equilibrium is arguably the most central solution in game theory. While finding the Nash …

Cited by 1 · Related articles

[PDF] jmlr.org

Strategic knowledge transfer

MO Smith, T Anthony, MP Wellman - Journal of Machine Learning …, 2023 - jmlr.org

In the course of playing or solving a game, it is common to face a series of changing other-
agent strategies. These strategies often share elements: the set of possible policies to play …

Cited by 3 · Related articles

[PDF] arxiv.org

NeuPL: Neural population learning

S Liu, L Marris, D Hennes, J Merel, N Heess… - arXiv preprint arXiv …, 2022 - arxiv.org

Learning in strategy games (e.g., StarCraft, poker) requires the discovery of diverse policies.
This is often achieved by iteratively training new policies against existing ones, growing a …

Cited by 24 · Related articles · All 4 versions

[PDF] arxiv.org

A game-theoretic approach for improving generalization ability of TSP solvers

C Wang, Y Yang, O Slumbers, C Han, T Guo… - arXiv preprint arXiv …, 2021 - arxiv.org

In this paper, we introduce a two-player zero-sum framework between a trainable Solver
and a Data Generator to improve the generalization ability of deep learning …

Cited by 15 · Related articles · All 4 versions

[PDF] arxiv.org

Neural Population Learning beyond Symmetric Zero-sum Games

S Liu, L Marris, M Lanctot, G Piliouras, JZ Leibo… - arXiv preprint arXiv …, 2024 - arxiv.org

We study computationally efficient methods for finding equilibria in n-player general-sum
games, specifically ones that afford complex visuomotor skills. We show how existing …

Cited by 2 · Related articles · All 5 versions

[PDF] arxiv.org

Efficient policy space response oracles

M Zhou, J Chen, Y Wen, W Zhang, Y Yang, Y Yu… - arXiv preprint arXiv …, 2022 - arxiv.org

Policy Space Response Oracle methods (PSRO) provide a general solution to learn Nash
equilibrium in two-player zero-sum games but suffer from two drawbacks: (1) the computation …

Cited by 13 · Related articles · All 3 versions

[PDF] arxiv.org

Co-Learning Empirical Games and World Models

MO Smith, MP Wellman - arXiv preprint arXiv:2305.14223, 2023 - arxiv.org

Game-based decision-making involves reasoning over both world dynamics and strategic
interactions among the agents. Typically, empirical models capturing these respective …

Cited by 3 · Related articles · All 4 versions

[PDF] arxiv.org

Self-adaptive PSRO: Towards an Automatic Population-based Game Solver

P Li, S Li, C Yang, X Wang, X Huang, H Chan… - arXiv preprint arXiv …, 2024 - arxiv.org

Policy-Space Response Oracles (PSRO) as a general algorithmic framework has achieved
state-of-the-art performance in learning equilibrium policies of two-player zero-sum games …

Cited by 1 · Related articles · All 3 versions

[PDF] frontiersin.org

Learning to play against any mixture of opponents

MO Smith, T Anthony, MP Wellman - Frontiers in Artificial Intelligence, 2023 - frontiersin.org

Intuitively, experience playing against one mixture of opponents in a given domain should
be relevant for a different mixture in the same domain. If the mixture changes, ideally we …

Cited by 15 · Related articles · All 6 versions
