Independent policy gradient for large-scale Markov potential games: Sharper rates, function approximation, and game-agnostic convergence

D Ding, CY Wei, K Zhang… - … Conference on Machine …, 2022 - proceedings.mlr.press
We examine global non-asymptotic convergence properties of policy gradient methods for
multi-agent reinforcement learning (RL) problems in Markov potential games (MPGs). To …

On improving model-free algorithms for decentralized multi-agent reinforcement learning

W Mao, L Yang, K Zhang… - … Conference on Machine …, 2022 - proceedings.mlr.press
Multi-agent reinforcement learning (MARL) algorithms often suffer from an exponential
sample complexity dependence on the number of agents, a phenomenon known as the …

Offline pre-trained multi-agent decision transformer

L Meng, M Wen, C Le, X Li, D Xing, W Zhang… - Machine Intelligence …, 2023 - Springer
Offline reinforcement learning leverages previously collected offline datasets to learn
optimal policies with no necessity to access the real environment. Such a paradigm is also …

Independent natural policy gradient always converges in Markov potential games

R Fox, SM McAleer, W Overman… - International …, 2022 - proceedings.mlr.press
Natural policy gradient has emerged as one of the most successful algorithms for computing
optimal policies in challenging Reinforcement Learning (RL) tasks, yet, very little was known …

Settling the variance of multi-agent policy gradients

JG Kuba, M Wen, L Meng, H Zhang… - Advances in …, 2021 - proceedings.neurips.cc
Policy gradient (PG) methods are popular reinforcement learning (RL) methods where a
baseline is often applied to reduce the variance of gradient estimates. In multi-agent RL …

Heterogeneous-agent reinforcement learning

Y Zhong, JG Kuba, X Feng, S Hu, J Ji, Y Yang - Journal of Machine …, 2024 - jmlr.org
The necessity for cooperation among intelligent machines has popularised cooperative
multi-agent reinforcement learning (MARL) in AI research. However, many research endeavours …

Policy diagnosis via measuring role diversity in cooperative multi-agent RL

S Hu, C Xie, X Liang, X Chang - International Conference on …, 2022 - proceedings.mlr.press
Cooperative multi-agent reinforcement learning (MARL) is making rapid progress for solving
tasks in a grid world and real-world scenarios, in which agents are given different attributes …

Provably fast convergence of independent natural policy gradient for Markov potential games

Y Sun, T Liu, R Zhou, PR Kumar… - Advances in Neural …, 2023 - proceedings.neurips.cc
This work studies an independent natural policy gradient (NPG) algorithm for the multi-agent
reinforcement learning problem in Markov potential games. It is shown that, under mild …

Gradient play in stochastic games: Stationary points and local geometry

RC Zhang, Z Ren, N Li - IFAC-PapersOnLine, 2022 - Elsevier
We study the stationary points and local geometry of gradient play for stochastic games
(SGs), where each agent tries to maximize its own total discounted reward by making …

Exploration-exploitation in multi-agent learning: Catastrophe theory meets game theory

S Leonardos, G Piliouras - Artificial Intelligence, 2022 - Elsevier
Exploration-exploitation is a powerful and practical tool in multi-agent learning (MAL);
however, its effects are far from understood. To make progress in this direction, we study a …