Multi-agent reinforcement learning: A selective overview of theories and algorithms

K Zhang, Z Yang, T Başar - Handbook of reinforcement learning and …, 2021 - Springer
Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …

Solving a class of non-convex min-max games using iterative first order methods

M Nouiehed, M Sanjabi, T Huang… - Advances in …, 2019 - proceedings.neurips.cc
Recent applications arising in machine learning have spurred significant interest in solving
min-max saddle-point games. This problem has been extensively studied in the convex …

Model-free opponent shaping

C Lu, T Willi, CAS De Witt… - … Conference on Machine …, 2022 - proceedings.mlr.press
In general-sum games, the interaction of self-interested learning agents commonly leads to
collectively worst-case outcomes, such as defect-defect in the iterated prisoner's dilemma …

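A note on the defect-defect outcome mentioned in the snippet above: with a standard prisoner's dilemma payoff matrix (the specific values below are illustrative, not taken from the paper), defection is each player's dominant strategy, yet mutual defection is Pareto-dominated by mutual cooperation.

              Cooperate    Defect
  Cooperate   (-1, -1)     (-3,  0)
  Defect      ( 0, -3)     (-2, -2)

(Row player's payoff listed first.)
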
A unified approach to reinforcement learning, quantal response equilibria, and two-player zero-sum games

S Sokota, R D'Orazio, JZ Kolter, N Loizou… - arXiv preprint arXiv …, 2022 - arxiv.org
This work studies an algorithm, which we call magnetic mirror descent, that is inspired by
mirror descent and the non-Euclidean proximal gradient algorithm. Our contribution is …

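Background for the entry above, as a sketch rather than the paper's exact formulation: standard mirror descent over a strategy set \Pi with Bregman divergence B_\psi and step size \eta updates

  \pi_{t+1} = \arg\min_{\pi \in \Pi} \; \eta \langle g_t, \pi \rangle + B_\psi(\pi, \pi_t),

where g_t is the current loss gradient; with the negative-entropy mirror map, B_\psi is the KL divergence. Magnetic mirror descent additionally regularizes each update toward a fixed reference ("magnet") strategy; see the paper for the precise update.
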
LOGAN: Latent optimisation for generative adversarial networks

Y Wu, J Donahue, D Balduzzi, K Simonyan… - arXiv preprint arXiv …, 2019 - arxiv.org
Training generative adversarial networks requires balancing delicate adversarial
dynamics. Even with careful tuning, training may diverge or end up in a bad equilibrium with …

COLA: consistent learning with opponent-learning awareness

T Willi, AH Letcher, J Treutlein… - … on Machine Learning, 2022 - proceedings.mlr.press
Learning in general-sum games is unstable and frequently leads to socially undesirable
(Pareto-dominated) outcomes. To mitigate this, Learning with Opponent-Learning …

Competitive gradient descent

F Schäfer, A Anandkumar - Advances in Neural Information …, 2019 - proceedings.neurips.cc
We introduce a new algorithm for the numerical computation of Nash equilibria of
competitive two-player games. Our method is a natural generalization of gradient descent to …

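A reconstructed sketch of the update alluded to above, for the zero-sum case \min_x \max_y f(x, y) with step size \eta (consult the paper for the authoritative statement): each player best-responds to a local bilinear-quadratic model of the game, which yields

  x_{k+1} = x_k - \eta \, (I + \eta^2 D^2_{xy}f \, D^2_{yx}f)^{-1} (\nabla_x f + \eta \, D^2_{xy}f \, \nabla_y f),
  y_{k+1} = y_k + \eta \, (I + \eta^2 D^2_{yx}f \, D^2_{xy}f)^{-1} (\nabla_y f - \eta \, D^2_{yx}f \, \nabla_x f).

When the mixed second-derivative blocks D^2_{xy}f vanish, this reduces to plain simultaneous gradient descent-ascent.
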
A unified single-loop alternating gradient projection algorithm for nonconvex–concave and convex–nonconcave minimax problems

Z Xu, H Zhang, Y Xu, G Lan - Mathematical Programming, 2023 - Springer
Much recent research effort has been directed to the development of efficient algorithms for
solving minimax problems with theoretical convergence guarantees due to the relevance of …

Near-optimal local convergence of alternating gradient descent-ascent for minimax optimization

G Zhang, Y Wang, L Lessard… - … Conference on Artificial …, 2022 - proceedings.mlr.press
Smooth minimax games often proceed by simultaneous or alternating gradient updates.
Although algorithms with alternating updates are commonly used in practice, the majority of …

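For concreteness, the two schemes contrasted above, written for \min_x \max_y f(x, y) with step size \eta (standard definitions, not specific to this paper): simultaneous updates evaluate both gradients at the current iterate,

  x_{t+1} = x_t - \eta \nabla_x f(x_t, y_t),   y_{t+1} = y_t + \eta \nabla_y f(x_t, y_t),

while alternating updates let the second player react to the first player's new iterate,

  x_{t+1} = x_t - \eta \nabla_x f(x_t, y_t),   y_{t+1} = y_t + \eta \nabla_y f(x_{t+1}, y_t).
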
Learning neural Hamiltonian dynamics: a methodological overview

Z Chen, M Feng, J Yan, H Zha - arXiv preprint arXiv:2203.00128, 2022 - arxiv.org
The past few years have witnessed an increased interest in learning Hamiltonian dynamics
in deep learning frameworks. As an inductive bias based on physical laws, Hamiltonian …
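For reference, the physical structure used as the inductive bias above: a Hamiltonian H(q, p) over positions q and momenta p generates dynamics through Hamilton's equations,

  \dot{q} = \partial H / \partial p,   \dot{p} = -\partial H / \partial q,

and neural approaches typically parameterize H with a network trained so that these equations reproduce observed trajectories.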