Solving heads-up limit Texas hold'em

Multi-agent reinforcement learning: A selective overview of theories and algorithms

K Zhang, Z Yang, T Başar - Handbook of reinforcement learning and …, 2021 - Springer

Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …

Cited by 1638 (all 8 versions)

[PDF] arxiv.org

An overview of multi-agent reinforcement learning from game theoretical perspective

Y Yang, J Wang - arXiv preprint arXiv:2011.00583, 2020 - arxiv.org

Following the remarkable success of the AlphaGO series, 2019 was a booming year that
witnessed significant advances in multi-agent reinforcement learning (MARL) techniques …

Cited by 340 (all 2 versions)

[PDF] arxiv.org

A modern introduction to online learning

F Orabona - arXiv preprint arXiv:1912.13213, 2019 - arxiv.org

In this monograph, I introduce the basic concepts of Online Learning through a modern view
of Online Convex Optimization. Here, online learning refers to the framework of regret …

Cited by 406 (all 3 versions)

[PDF] arxiv.org

OpenSpiel: A framework for reinforcement learning in games

M Lanctot, E Lockhart, JB Lespiau, V Zambaldi… - arXiv preprint arXiv …, 2019 - arxiv.org

OpenSpiel is a collection of environments and algorithms for research in general
reinforcement learning and search/planning in games. OpenSpiel supports n-player (single …

Cited by 286 (all 6 versions)

[PDF] mlr.press

Deep counterfactual regret minimization

N Brown, A Lerer, S Gross… - … conference on machine …, 2019 - proceedings.mlr.press

Counterfactual Regret Minimization (CFR) is the leading algorithm for solving large
imperfect-information games. It converges to an equilibrium by iteratively traversing the …

Cited by 280 (all 7 versions)

[HTML] acm.org

Heads-up limit hold'em poker is solved

M Bowling, N Burch, M Johanson, O Tammelin - Science, 2015 - science.org

Poker is a family of games that exhibit imperfect information, where players do not have full
knowledge of past events. Whereas many perfect-information games have been solved (e.g., …

Cited by 620 (all 15 versions)

[PDF] science.org

Student of Games: A unified learning algorithm for both perfect and imperfect information games

M Schmid, M Moravčík, N Burch, R Kadlec… - Science …, 2023 - science.org

Games have a long history as benchmarks for progress in artificial intelligence. Approaches
using search and learning produced strong performance across many perfect information …

Cited by 72 (all 8 versions)

[PDF] aaai.org

Solving imperfect-information games via discounted regret minimization

N Brown, T Sandholm - Proceedings of the AAAI Conference on Artificial …, 2019 - aaai.org

Counterfactual regret minimization (CFR) is a family of iterative algorithms that are the most
popular and, in practice, fastest approach to approximately solving large …

Cited by 185 (all 13 versions)

[PDF] neurips.cc

Actor-critic policy optimization in partially observable multiagent environments

S Srinivasan, M Lanctot, V Zambaldi… - Advances in neural …, 2018 - proceedings.neurips.cc

Optimization of parameterized policies for reinforcement learning (RL) is an important and
challenging problem in artificial intelligence. Among the most common approaches are …

Cited by 173 (all 9 versions)

[PDF] neurips.cc

Safe and nested subgame solving for imperfect-information games

N Brown, T Sandholm - Advances in neural information …, 2017 - proceedings.neurips.cc

In imperfect-information games, the optimal strategy in a subgame may depend on the
strategy in other, unreached subgames. Thus a subgame cannot be solved in isolation and …

Cited by 196 (all 12 versions)
