Y Yang, J Wang - arXiv preprint arXiv:2011.00583, 2020 - arxiv.org
Following the remarkable success of the AlphaGO series, 2019 was a booming year that witnessed significant advances in multi-agent reinforcement learning (MARL) techniques …
F Orabona - arXiv preprint arXiv:1912.13213, 2019 - arxiv.org
In this monograph, I introduce the basic concepts of Online Learning through a modern view of Online Convex Optimization. Here, online learning refers to the framework of regret …
OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games. OpenSpiel supports n-player (single …
N Brown, A Lerer, S Gross… - … conference on machine …, 2019 - proceedings.mlr.press
Abstract Counterfactual Regret Minimization (CFR) is the leading algorithm for solving large imperfect-information games. It converges to an equilibrium by iteratively traversing the …
Poker is a family of games that exhibit imperfect information, where players do not have full knowledge of past events. Whereas many perfect-information games have been solved (eg …
Games have a long history as benchmarks for progress in artificial intelligence. Approaches using search and learning produced strong performance across many perfect information …
N Brown, T Sandholm - Proceedings of the AAAI Conference on Artificial …, 2019 - aaai.org
Counterfactual regret minimization (CFR) is a family of iterative algorithms that are the most popular and, in practice, fastest approach to approximately solving large …
Optimization of parameterized policies for reinforcement learning (RL) is an important and challenging problem in artificial intelligence. Among the most common approaches are …
N Brown, T Sandholm - Advances in neural information …, 2017 - proceedings.neurips.cc
In imperfect-information games, the optimal strategy in a subgame may depend on the strategy in other, unreached subgames. Thus a subgame cannot be solved in isolation and …