Policy space diversity for non-transitive games

J Yao, W Liu, H Fu, Y Yang… - Advances in Neural …, 2024 - proceedings.neurips.cc
Policy-Space Response Oracles (PSRO) is an influential algorithm framework for
approximating a Nash Equilibrium (NE) in multi-agent non-transitive games. Many previous …

Learning in games: a systematic review

RJ Qin, Y Yu - Science China Information Sciences, 2024 - Springer
Game theory studies the mathematical models for self-interested individuals. Nash
equilibrium is arguably the most central solution in game theory. While finding the Nash …

Strategic knowledge transfer

MO Smith, T Anthony, MP Wellman - Journal of Machine Learning …, 2023 - jmlr.org
In the course of playing or solving a game, it is common to face a series of changing
other-agent strategies. These strategies often share elements: the set of possible policies to play …

NeuPL: Neural population learning

S Liu, L Marris, D Hennes, J Merel, N Heess… - arXiv preprint arXiv …, 2022 - arxiv.org
Learning in strategy games (e.g., StarCraft, poker) requires the discovery of diverse policies.
This is often achieved by iteratively training new policies against existing ones, growing a …

A game-theoretic approach for improving generalization ability of TSP solvers

C Wang, Y Yang, O Slumbers, C Han, T Guo… - arXiv preprint arXiv …, 2021 - arxiv.org
In this paper, we introduce a two-player zero-sum framework between a trainable
Solver and a Data Generator to improve the generalization ability of deep learning …

Neural Population Learning beyond Symmetric Zero-sum Games

S Liu, L Marris, M Lanctot, G Piliouras, JZ Leibo… - arXiv preprint arXiv …, 2024 - arxiv.org
We study computationally efficient methods for finding equilibria in n-player general-sum
games, specifically ones that afford complex visuomotor skills. We show how existing …

Efficient policy space response oracles

M Zhou, J Chen, Y Wen, W Zhang, Y Yang, Y Yu… - arXiv preprint arXiv …, 2022 - arxiv.org
Policy Space Response Oracle methods (PSRO) provide a general solution to learn Nash
equilibrium in two-player zero-sum games but suffer from two drawbacks: (1) the computation …

Co-Learning Empirical Games and World Models

MO Smith, MP Wellman - arXiv preprint arXiv:2305.14223, 2023 - arxiv.org
Game-based decision-making involves reasoning over both world dynamics and strategic
interactions among the agents. Typically, empirical models capturing these respective …

Self-adaptive PSRO: Towards an Automatic Population-based Game Solver

P Li, S Li, C Yang, X Wang, X Huang, H Chan… - arXiv preprint arXiv …, 2024 - arxiv.org
Policy-Space Response Oracles (PSRO) as a general algorithmic framework has achieved
state-of-the-art performance in learning equilibrium policies of two-player zero-sum games …

Learning to play against any mixture of opponents

MO Smith, T Anthony, MP Wellman - Frontiers in Artificial Intelligence, 2023 - frontiersin.org
Intuitively, experience playing against one mixture of opponents in a given domain should
be relevant for a different mixture in the same domain. If the mixture changes, ideally we …