Multi-agent reinforcement learning: A selective overview of theories and algorithms

K Zhang, Z Yang, T Başar - Handbook of reinforcement learning and …, 2021 - Springer
Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …

Solving a class of non-convex min-max games using iterative first order methods

M Nouiehed, M Sanjabi, T Huang… - Advances in …, 2019 - proceedings.neurips.cc
Recent applications arising in machine learning have spurred significant interest in solving
min-max saddle-point games. This problem has been extensively studied in the convex …

Model-free opponent shaping

C Lu, T Willi, CAS De Witt… - … Conference on Machine …, 2022 - proceedings.mlr.press
In general-sum games, the interaction of self-interested learning agents commonly leads to
collectively worst-case outcomes, such as defect-defect in the iterated prisoner's dilemma …

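A note on the defect-defect outcome mentioned in the snippet above: with a standard prisoner's dilemma payoff matrix (the specific values below are illustrative, not taken from the paper), defection is each player's dominant strategy, yet mutual defection is Pareto-dominated by mutual cooperation.

              Cooperate    Defect
  Cooperate   (-1, -1)     (-3,  0)
  Defect      ( 0, -3)     (-2, -2)

(Row player's payoff listed first.)
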
A unified approach to reinforcement learning, quantal response equilibria, and two-player zero-sum games

S Sokota, R D'Orazio, JZ Kolter, N Loizou… - arXiv preprint arXiv …, 2022 - arxiv.org
This work studies an algorithm, which we call magnetic mirror descent, that is inspired by
mirror descent and the non-Euclidean proximal gradient algorithm. Our contribution is …

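Background for the entry above, as a sketch rather than the paper's exact formulation: standard mirror descent over a strategy set \Pi with Bregman divergence B_\psi and step size \eta updates

  \pi_{t+1} = \arg\min_{\pi \in \Pi} \; \eta \langle g_t, \pi \rangle + B_\psi(\pi, \pi_t),

where g_t is the current loss gradient; with the negative-entropy mirror map, B_\psi is the KL divergence. Magnetic mirror descent additionally regularizes each update toward a fixed reference ("magnet") strategy; see the paper for the precise update.
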
LOGAN: Latent optimisation for generative adversarial networks

Y Wu, J Donahue, D Balduzzi, K Simonyan… - arXiv preprint arXiv …, 2019 - arxiv.org
Training generative adversarial networks requires balancing delicate adversarial
dynamics. Even with careful tuning, training may diverge or end up in a bad equilibrium with …

COLA: consistent learning with opponent-learning awareness

T Willi, AH Letcher, J Treutlein… - … on Machine Learning, 2022 - proceedings.mlr.press
Learning in general-sum games is unstable and frequently leads to socially undesirable
(Pareto-dominated) outcomes. To mitigate this, Learning with Opponent-Learning …

Competitive gradient descent

F Schäfer, A Anandkumar - Advances in Neural Information …, 2019 - proceedings.neurips.cc
We introduce a new algorithm for the numerical computation of Nash equilibria of
competitive two-player games. Our method is a natural generalization of gradient descent to …

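A reconstructed sketch of the update alluded to above, for the zero-sum case \min_x \max_y f(x, y) with step size \eta (consult the paper for the authoritative statement): each player best-responds to a local bilinear-quadratic model of the game, which yields

  x_{k+1} = x_k - \eta \, (I + \eta^2 D^2_{xy}f \, D^2_{yx}f)^{-1} (\nabla_x f + \eta \, D^2_{xy}f \, \nabla_y f),
  y_{k+1} = y_k + \eta \, (I + \eta^2 D^2_{yx}f \, D^2_{xy}f)^{-1} (\nabla_y f - \eta \, D^2_{yx}f \, \nabla_x f).

When the mixed second-derivative blocks D^2_{xy}f vanish, this reduces to plain simultaneous gradient descent-ascent.
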
A unified single-loop alternating gradient projection algorithm for nonconvex–concave and convex–nonconcave minimax problems

Z Xu, H Zhang, Y Xu, G Lan - Mathematical Programming, 2023 - Springer
Much recent research effort has been directed to the development of efficient algorithms for
solving minimax problems with theoretical convergence guarantees due to the relevance of …

Near-optimal local convergence of alternating gradient descent-ascent for minimax optimization

G Zhang, Y Wang, L Lessard… - … Conference on Artificial …, 2022 - proceedings.mlr.press
Smooth minimax games often proceed by simultaneous or alternating gradient updates.
Although algorithms with alternating updates are commonly used in practice, the majority of …

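For concreteness, the two schemes contrasted above, written for \min_x \max_y f(x, y) with step size \eta (standard definitions, not specific to this paper): simultaneous updates evaluate both gradients at the current iterate,

  x_{t+1} = x_t - \eta \nabla_x f(x_t, y_t),   y_{t+1} = y_t + \eta \nabla_y f(x_t, y_t),

while alternating updates let the second player react to the first player's new iterate,

  x_{t+1} = x_t - \eta \nabla_x f(x_t, y_t),   y_{t+1} = y_t + \eta \nabla_y f(x_{t+1}, y_t).
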
Learning neural Hamiltonian dynamics: a methodological overview

Z Chen, M Feng, J Yan, H Zha - arXiv preprint arXiv:2203.00128, 2022 - arxiv.org
The past few years have witnessed an increased interest in learning Hamiltonian dynamics
in deep learning frameworks. As an inductive bias based on physical laws, Hamiltonian …
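For reference, the physical structure used as the inductive bias above: a Hamiltonian H(q, p) over positions q and momenta p generates dynamics through Hamilton's equations,

  \dot{q} = \partial H / \partial p,   \dot{p} = -\partial H / \partial q,

and neural approaches typically parameterize H with a network trained so that these equations reproduce observed trajectories.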