Multi-agent reinforcement learning: A selective overview of theories and algorithms

K Zhang, Z Yang, T Başar - Handbook of reinforcement learning and …, 2021 - Springer
Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …

An overview of multi-agent reinforcement learning from game theoretical perspective

Y Yang, J Wang - arXiv preprint arXiv:2011.00583, 2020 - arxiv.org
Following the remarkable success of the AlphaGO series, 2019 was a booming year that
witnessed significant advances in multi-agent reinforcement learning (MARL) techniques …

A modern introduction to online learning

F Orabona - arXiv preprint arXiv:1912.13213, 2019 - arxiv.org
In this monograph, I introduce the basic concepts of Online Learning through a modern view
of Online Convex Optimization. Here, online learning refers to the framework of regret …

OpenSpiel: A framework for reinforcement learning in games

M Lanctot, E Lockhart, JB Lespiau, V Zambaldi… - arXiv preprint arXiv …, 2019 - arxiv.org
OpenSpiel is a collection of environments and algorithms for research in general
reinforcement learning and search/planning in games. OpenSpiel supports n-player (single …

Deep counterfactual regret minimization

N Brown, A Lerer, S Gross… - … conference on machine …, 2019 - proceedings.mlr.press
Counterfactual Regret Minimization (CFR) is the leading algorithm for solving large
imperfect-information games. It converges to an equilibrium by iteratively traversing the …

Heads-up limit hold'em poker is solved

M Bowling, N Burch, M Johanson, O Tammelin - Science, 2015 - science.org
Poker is a family of games that exhibit imperfect information, where players do not have full
knowledge of past events. Whereas many perfect-information games have been solved (e.g. …

Student of Games: A unified learning algorithm for both perfect and imperfect information games

M Schmid, M Moravčík, N Burch, R Kadlec… - Science …, 2023 - science.org
Games have a long history as benchmarks for progress in artificial intelligence. Approaches
using search and learning produced strong performance across many perfect information …

Solving imperfect-information games via discounted regret minimization

N Brown, T Sandholm - Proceedings of the AAAI Conference on Artificial …, 2019 - aaai.org
Counterfactual regret minimization (CFR) is a family of iterative algorithms that are the most
popular and, in practice, fastest approach to approximately solving large …

Actor-critic policy optimization in partially observable multiagent environments

S Srinivasan, M Lanctot, V Zambaldi… - Advances in neural …, 2018 - proceedings.neurips.cc
Optimization of parameterized policies for reinforcement learning (RL) is an important and
challenging problem in artificial intelligence. Among the most common approaches are …

Safe and nested subgame solving for imperfect-information games

N Brown, T Sandholm - Advances in neural information …, 2017 - proceedings.neurips.cc
In imperfect-information games, the optimal strategy in a subgame may depend on the
strategy in other, unreached subgames. Thus a subgame cannot be solved in isolation and …