Actor-critic policy optimization in a large-scale imperfect-information game

J Yao, W Liu, H Fu, Y Yang… - Advances in Neural …, 2024 - proceedings.neurips.cc

Policy-Space Response Oracles (PSRO) is an influential algorithm framework for
approximating a Nash Equilibrium (NE) in multi-agent non-transitive games. Many previous …

Cited by 11 · Related articles · All 5 versions

[PDF] arxiv.org

A survey of decision making in adversarial games

X Li, M Meng, Y Hong, J Chen - Science China Information Sciences, 2024 - Springer

In many practical applications, such as poker, chess, drug interdiction, cybersecurity, and
national defense, players often have adversarial stances, i.e., the selfish actions of each …

Cited by 17 · Related articles · All 3 versions

[PDF] scichina.com

Learning in games: a systematic review

RJ Qin, Y Yu - Science China Information Sciences, 2024 - Springer

Game theory studies the mathematical models for self-interested individuals. Nash
equilibrium is arguably the most central solution in game theory. While finding the Nash …

Cited by 1 · Related articles · All 2 versions

[PDF] neurips.cc

PerfectDou: Dominating DouDizhu with perfect information distillation

G Yang, M Liu, W Hong, W Zhang… - Advances in …, 2022 - proceedings.neurips.cc

As a challenging multi-player card game, DouDizhu has recently drawn much attention for
analyzing competition and collaboration in imperfect-information games. In this paper, we …

Cited by 31 · Related articles · All 5 versions

[PDF] arxiv.org

ESCHER: Eschewing importance sampling in games by computing a history value function to estimate regret

S McAleer, G Farina, M Lanctot, T Sandholm - arXiv preprint arXiv …, 2022 - arxiv.org

Recent techniques for approximating Nash equilibria in very large games leverage neural
networks to learn approximately optimal policies (strategies). One promising line of research …

Cited by 21 · Related articles · All 5 versions

[PDF] neurips.cc

A robust and opponent-aware league training method for StarCraft II

R Huang, X Wu, H Yu, Z Fan, H Fu… - Advances in Neural …, 2024 - proceedings.neurips.cc

It is extremely difficult to train a superhuman Artificial Intelligence (AI) for games of similar
size to StarCraft II. AlphaStar is the first AI that beat human professionals in the full game of …

Cited by 7 · Related articles · All 3 versions

[PDF] openreview.net

Quality-similar diversity via population based reinforcement learning

S Wu, J Yao, H Fu, Y Tian, C Qian, Y Yang… - The Eleventh …, 2023 - openreview.net

Diversity is a growing research topic in Reinforcement Learning (RL). Previous research on
diversity has mainly focused on promoting diversity to encourage exploration and thereby …

Cited by 17 · Related articles

[PDF] arxiv.org

Game-theoretic robust reinforcement learning handles temporally-coupled perturbations

Y Liang, Y Sun, R Zheng, X Liu, B Eysenbach… - arXiv preprint arXiv …, 2023 - arxiv.org

Deploying reinforcement learning (RL) systems requires robustness to uncertainty and
model misspecification, yet prior robust RL methods typically only study noise introduced …

Cited by 5 · Related articles · All 5 versions

[PDF] aaai.org

An efficient deep reinforcement learning algorithm for solving imperfect information extensive-form games

L Meng, Z Ge, P Tian, B An, Y Gao - … of the AAAI Conference on Artificial …, 2023 - ojs.aaai.org

One of the most popular methods for learning Nash equilibrium (NE) in large-scale imperfect
information extensive-form games (IIEFGs) is the neural variants of counterfactual regret …

Cited by 5 · Related articles · All 2 versions

[PDF] mlr.press

Greedy when sure and conservative when uncertain about the opponents

H Fu, Y Tian, H Yu, W Liu, S Wu… - International …, 2022 - proceedings.mlr.press

We develop a new approach, named Greedy when Sure and Conservative when Uncertain
(GSCU), to competing online against unknown and nonstationary opponents. GSCU …

Cited by 12 · Related articles · All 4 versions
