Global convergence of localized policy iteration in networked multi-agent reinforcement learning

C Park, K Zhang, A Ozdaglar - Advances in Neural …, 2024 - proceedings.neurips.cc

We study a new class of Markov games, (multi-player) zero-sum Markov Games with
Networked separable interactions (zero-sum NMGs), to model the local interaction …

Cited by 6 · Related articles · All 6 versions

Scalable primal-dual actor-critic method for safe multi-agent RL with general utilities

D Ying, Y Zhang, Y Ding, A Koppel… - Advances in Neural …, 2024 - proceedings.neurips.cc

We investigate safe multi-agent reinforcement learning, where agents seek to collectively
maximize an aggregate sum of local objectives while satisfying their own safety constraints …

Cited by 6 · Related articles · All 6 versions

A finite-sample analysis of payoff-based independent learning in zero-sum stochastic games

Z Chen, K Zhang, E Mazumdar… - Advances in …, 2024 - proceedings.neurips.cc

In this work, we study two-player zero-sum stochastic games and develop a variant of the
smoothed best-response learning dynamics that combines independent learning dynamics …

Cited by 9 · Related articles · All 8 versions

Convergence rates for localized actor-critic in networked Markov potential games

Z Zhou, Z Chen, Y Lin… - Uncertainty in Artificial …, 2023 - proceedings.mlr.press

We introduce a class of networked Markov potential games where agents are associated
with nodes in a network. Each agent has its own local potential function, and the reward of …

Cited by 7 · Related articles · All 8 versions

Finite-time frequentist regret bounds of multi-agent Thompson sampling on sparse hypergraphs

T Jin, HL Hsu, W Chang, P Xu - … of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org

We study the multi-agent multi-armed bandit (MAMAB) problem, where agents are factored
into overlapping groups. Each group represents a hyperedge, forming a hypergraph over …

Cited by 3 · Related articles · All 4 versions

Scalable and sample efficient distributed policy gradient algorithms in multi-agent networked systems

X Liu, H Wei, L Ying - arXiv preprint arXiv:2212.06357, 2022 - arxiv.org

This paper studies a class of multi-agent reinforcement learning (MARL) problems where the
reward that an agent receives depends on the states of other agents, but the next state only …

Cited by 7 · Related articles · All 2 versions

Learning Nash Equilibria in Zero-Sum Markov Games: A Single Time-scale Algorithm Under Weak Reachability

R Ouhamma, M Kamgarpour - arXiv preprint arXiv:2312.08008, 2023 - arxiv.org

We consider decentralized learning for zero-sum games, where players only see their payoff
information and are agnostic to actions and payoffs of the opponent. Previous works …

Cited by 1 · Related articles · All 3 versions

Randomized Exploration in Cooperative Multi-Agent Reinforcement Learning

HL Hsu, W Wang, M Pajic, P Xu - arXiv preprint arXiv:2404.10728, 2024 - arxiv.org

We present the first study on provably efficient randomized exploration in cooperative
multi-agent reinforcement learning (MARL). We propose a unified algorithm framework for …

Cited by 1 · Related articles · All 2 versions

Distributed Policy Gradient for Linear Quadratic Networked Control with Limited Communication Range

Y Yan, Y Shen - IEEE Transactions on Signal Processing, 2024 - ieeexplore.ieee.org

This paper proposes a scalable distributed policy gradient method and proves its
convergence to near-optimal solution in multi-agent linear quadratic networked systems …

Approximate Global Convergence of Independent Learning in Multi-Agent Systems

R Jin, Z Chen, Y Lin, J Song, A Wierman - arXiv preprint arXiv:2405.19811, 2024 - arxiv.org

Independent learning (IL), despite being a popular approach in practice to achieve
scalability in large-scale multi-agent systems, usually lacks global convergence guarantees …
