Poincaré recurrence, cycles and spurious equilibria in gradient-descent-ascent for non-convex...

F Farnia, A Ozdaglar - International Conference on Machine …, 2020 - proceedings.mlr.press

Generative adversarial networks (GANs) represent a zero-sum game between two machine
players, a generator and a discriminator, designed to learn the distribution of data. While …

Cited by 125 - Related articles - All 4 versions

The limits of min-max optimization algorithms: Convergence to spurious non-critical sets

YP Hsieh, P Mertikopoulos… - … Conference on Machine …, 2021 - proceedings.mlr.press

Compared to minimization, the min-max optimization in machine learning applications is
considerably more convoluted because of the existence of cycles and similar phenomena …

Cited by 110 - Related articles - All 13 versions

A unified single-loop alternating gradient projection algorithm for nonconvex–concave and convex–nonconcave minimax problems

Z Xu, H Zhang, Y Xu, G Lan - Mathematical Programming, 2023 - Springer

Much recent research effort has been directed to the development of efficient algorithms for
solving minimax problems with theoretical convergence guarantees due to the relevance of …

Cited by 108 - Related articles - All 7 versions

Explore aggressively, update conservatively: Stochastic extragradient methods with variable stepsize scaling

YG Hsieh, F Iutzeler, J Malick… - Advances in Neural …, 2020 - proceedings.neurips.cc

Owing to their stability and convergence speed, extragradient methods have become a
staple for solving large-scale saddle-point problems in machine learning. The basic premise …

Cited by 89 - Related articles - All 17 versions

A survey of decision making in adversarial games

X Li, M Meng, Y Hong, J Chen - Science China Information Sciences, 2024 - Springer

In many practical applications, such as poker, chess, drug interdiction, cybersecurity, and
national defense, players often have adversarial stances, i.e., the selfish actions of each …

Cited by 19 - Related articles - All 3 versions

Stackelberg actor-critic: Game-theoretic reinforcement learning algorithms

L Zheng, T Fiez, Z Alumbaugh, B Chasnov… - Proceedings of the AAAI …, 2022 - ojs.aaai.org

The hierarchical interaction between the actor and critic in actor-critic based reinforcement
learning algorithms naturally lends itself to a game-theoretic interpretation. We adopt this …

Cited by 49 - Related articles - All 9 versions

Derivative-free policy optimization for linear risk-sensitive and robust control design: Implicit regularization and sample complexity

K Zhang, X Zhang, B Hu… - Advances in neural …, 2021 - proceedings.neurips.cc

Direct policy search serves as one of the workhorses in modern reinforcement learning (RL),
and its applications in continuous control tasks have recently attracted increasing attention …

Cited by 46 - Related articles - All 9 versions

Adaptive extra-gradient methods for min-max optimization and games

K Antonakopoulos, EV Belmega… - arXiv preprint arXiv …, 2020 - arxiv.org

We present a new family of min-max optimization algorithms that automatically exploit the
geometry of the gradient data observed at earlier iterations to perform more informative extra …

Cited by 56 - Related articles - All 20 versions

Symmetric (optimistic) natural policy gradient for multi-agent learning with parameter convergence

S Pattathil, K Zhang, A Ozdaglar - … Conference on Artificial …, 2023 - proceedings.mlr.press

Multi-agent interactions are increasingly important in the context of reinforcement learning,
and the theoretical foundations of policy gradient methods have attracted surging research …

Cited by 14 - Related articles - All 3 versions

No-regret learning and mixed Nash equilibria: They do not mix

EV Vlatakis-Gkaragkounis, L Flokas… - Advances in …, 2020 - proceedings.neurips.cc

Understanding the behavior of no-regret dynamics in general N-player games is a
fundamental question in online learning and game theory. A folk result in the field states that …

Cited by 48 - Related articles - All 3 versions
