Policy space diversity for non-transitive games

J Yao, W Liu, H Fu, Y Yang… - Advances in Neural …, 2024 - proceedings.neurips.cc
Abstract Policy-Space Response Oracles (PSRO) is an influential algorithm framework for
approximating a Nash Equilibrium (NE) in multi-agent non-transitive games. Many previous …

A survey of decision making in adversarial games

X Li, M Meng, Y Hong, J Chen - Science China Information Sciences, 2024 - Springer
In many practical applications, such as poker, chess, drug interdiction, cybersecurity, and
national defense, players often have adversarial stances, i.e., the selfish actions of each …

Learning in games: a systematic review

RJ Qin, Y Yu - Science China Information Sciences, 2024 - Springer
Game theory studies the mathematical models for self-interested individuals. Nash
equilibrium is arguably the most central solution in game theory. While finding the Nash …

PerfectDou: Dominating DouDizhu with perfect information distillation

G Yang, M Liu, W Hong, W Zhang… - Advances in …, 2022 - proceedings.neurips.cc
As a challenging multi-player card game, DouDizhu has recently drawn much attention for
analyzing competition and collaboration in imperfect-information games. In this paper, we …

ESCHER: Eschewing importance sampling in games by computing a history value function to estimate regret

S McAleer, G Farina, M Lanctot, T Sandholm - arXiv preprint arXiv …, 2022 - arxiv.org
Recent techniques for approximating Nash equilibria in very large games leverage neural
networks to learn approximately optimal policies (strategies). One promising line of research …

A robust and opponent-aware league training method for StarCraft II

R Huang, X Wu, H Yu, Z Fan, H Fu… - Advances in Neural …, 2024 - proceedings.neurips.cc
It is extremely difficult to train a superhuman Artificial Intelligence (AI) for games of similar
size to StarCraft II. AlphaStar is the first AI that beat human professionals in the full game of …

Quality-similar diversity via population based reinforcement learning

S Wu, J Yao, H Fu, Y Tian, C Qian, Y Yang… - The Eleventh …, 2023 - openreview.net
Diversity is a growing research topic in Reinforcement Learning (RL). Previous research on
diversity has mainly focused on promoting diversity to encourage exploration and thereby …

Game-theoretic robust reinforcement learning handles temporally-coupled perturbations

Y Liang, Y Sun, R Zheng, X Liu, B Eysenbach… - arXiv preprint arXiv …, 2023 - arxiv.org
Deploying reinforcement learning (RL) systems requires robustness to uncertainty and
model misspecification, yet prior robust RL methods typically only study noise introduced …

An efficient deep reinforcement learning algorithm for solving imperfect information extensive-form games

L Meng, Z Ge, P Tian, B An, Y Gao - … of the AAAI Conference on Artificial …, 2023 - ojs.aaai.org
One of the most popular methods for learning Nash equilibrium (NE) in large-scale imperfect
information extensive-form games (IIEFGs) is the neural variants of counterfactual regret …

Greedy when sure and conservative when uncertain about the opponents

H Fu, Y Tian, H Yu, W Liu, S Wu… - International …, 2022 - proceedings.mlr.press
We develop a new approach, named Greedy when Sure and Conservative when Uncertain
(GSCU), to competing online against unknown and nonstationary opponents. GSCU …