Multi-player zero-sum Markov games with networked separable interactions

C Park, K Zhang, A Ozdaglar - Advances in Neural …, 2024 - proceedings.neurips.cc
We study a new class of Markov games, (multi-player) zero-sum Markov Games with
Networked separable interactions (zero-sum NMGs), to model the local interaction …

Scalable primal-dual actor-critic method for safe multi-agent RL with general utilities

D Ying, Y Zhang, Y Ding, A Koppel… - Advances in Neural …, 2024 - proceedings.neurips.cc
We investigate safe multi-agent reinforcement learning, where agents seek to collectively
maximize an aggregate sum of local objectives while satisfying their own safety constraints …

A finite-sample analysis of payoff-based independent learning in zero-sum stochastic games

Z Chen, K Zhang, E Mazumdar… - Advances in …, 2024 - proceedings.neurips.cc
In this work, we study two-player zero-sum stochastic games and develop a variant of the
smoothed best-response learning dynamics that combines independent learning dynamics …

Convergence rates for localized actor-critic in networked Markov potential games

Z Zhou, Z Chen, Y Lin… - Uncertainty in Artificial …, 2023 - proceedings.mlr.press
We introduce a class of networked Markov potential games where agents are associated
with nodes in a network. Each agent has its own local potential function, and the reward of …

Finite-time frequentist regret bounds of multi-agent Thompson sampling on sparse hypergraphs

T Jin, HL Hsu, W Chang, P Xu - … of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org
We study the multi-agent multi-armed bandit (MAMAB) problem, where agents are factored
into overlapping groups. Each group represents a hyperedge, forming a hypergraph over …

Scalable and sample efficient distributed policy gradient algorithms in multi-agent networked systems

X Liu, H Wei, L Ying - arXiv preprint arXiv:2212.06357, 2022 - arxiv.org
This paper studies a class of multi-agent reinforcement learning (MARL) problems where the
reward that an agent receives depends on the states of other agents, but the next state only …

Learning Nash Equilibria in Zero-Sum Markov Games: A Single Time-scale Algorithm Under Weak Reachability

R Ouhamma, M Kamgarpour - arXiv preprint arXiv:2312.08008, 2023 - arxiv.org
We consider decentralized learning for zero-sum games, where players only see their payoff
information and are agnostic to actions and payoffs of the opponent. Previous works …

Randomized Exploration in Cooperative Multi-Agent Reinforcement Learning

HL Hsu, W Wang, M Pajic, P Xu - arXiv preprint arXiv:2404.10728, 2024 - arxiv.org
We present the first study on provably efficient randomized exploration in cooperative
multi-agent reinforcement learning (MARL). We propose a unified algorithm framework for …

Distributed Policy Gradient for Linear Quadratic Networked Control with Limited Communication Range

Y Yan, Y Shen - IEEE Transactions on Signal Processing, 2024 - ieeexplore.ieee.org
This paper proposes a scalable distributed policy gradient method and proves its
convergence to near-optimal solution in multi-agent linear quadratic networked systems …

Approximate Global Convergence of Independent Learning in Multi-Agent Systems

R Jin, Z Chen, Y Lin, J Song, A Wierman - arXiv preprint arXiv:2405.19811, 2024 - arxiv.org
Independent learning (IL), despite being a popular approach in practice to achieve
scalability in large-scale multi-agent systems, usually lacks global convergence guarantees …