Learning in nonzero-sum stochastic games with potentials

Independent policy gradient for large-scale Markov potential games: Sharper rates, function approximation, and game-agnostic convergence

D Ding, CY Wei, K Zhang… - … Conference on Machine …, 2022 - proceedings.mlr.press

We examine global non-asymptotic convergence properties of policy gradient methods for
multi-agent reinforcement learning (RL) problems in Markov potential games (MPGs). To …

Cited by 66 Related articles All 10 versions

[PDF] mlr.press

On improving model-free algorithms for decentralized multi-agent reinforcement learning

W Mao, L Yang, K Zhang… - … Conference on Machine …, 2022 - proceedings.mlr.press

Multi-agent reinforcement learning (MARL) algorithms often suffer from an exponential
sample complexity dependence on the number of agents, a phenomenon known as the …

Cited by 55 Related articles All 6 versions

[HTML] springer.com

[HTML] Offline pre-trained multi-agent decision transformer

L Meng, M Wen, C Le, X Li, D Xing, W Zhang… - Machine Intelligence …, 2023 - Springer

Offline reinforcement learning leverages previously collected offline datasets to learn
optimal policies with no necessity to access the real environment. Such a paradigm is also …

Cited by 59 Related articles All 7 versions

[PDF] mlr.press

Independent natural policy gradient always converges in Markov potential games

R Fox, SM McAleer, W Overman… - International …, 2022 - proceedings.mlr.press

Natural policy gradient has emerged as one of the most successful algorithms for computing
optimal policies in challenging Reinforcement Learning (RL) tasks, yet, very little was known …

Cited by 47 Related articles All 9 versions

[PDF] neurips.cc

Settling the variance of multi-agent policy gradients

JG Kuba, M Wen, L Meng, H Zhang… - Advances in …, 2021 - proceedings.neurips.cc

Policy gradient (PG) methods are popular reinforcement learning (RL) methods where a
baseline is often applied to reduce the variance of gradient estimates. In multi-agent RL …

Cited by 50 Related articles All 9 versions

[PDF] jmlr.org

[PDF] Heterogeneous-agent reinforcement learning

Y Zhong, JG Kuba, X Feng, S Hu, J Ji, Y Yang - Journal of Machine …, 2024 - jmlr.org

The necessity for cooperation among intelligent machines has popularised cooperative
multi-agent reinforcement learning (MARL) in AI research. However, many research endeavours …

Cited by 12 Related articles All 3 versions

[PDF] mlr.press

Policy diagnosis via measuring role diversity in cooperative multi-agent RL

S Hu, C Xie, X Liang, X Chang - International Conference on …, 2022 - proceedings.mlr.press

Cooperative multi-agent reinforcement learning (MARL) is making rapid progress for solving
tasks in a grid world and real-world scenarios, in which agents are given different attributes …

Cited by 22 Related articles All 5 versions

[PDF] neurips.cc

Provably fast convergence of independent natural policy gradient for Markov potential games

Y Sun, T Liu, R Zhou, PR Kumar… - Advances in Neural …, 2023 - proceedings.neurips.cc

This work studies an independent natural policy gradient (NPG) algorithm for the multi-agent
reinforcement learning problem in Markov potential games. It is shown that, under mild …

Cited by 6 Related articles All 5 versions

[PDF] arxiv.org

Gradient play in stochastic games: Stationary points and local geometry

RC Zhang, Z Ren, N Li - IFAC-PapersOnLine, 2022 - Elsevier

We study the stationary points and local geometry of gradient play for stochastic games
(SGs), where each agent tries to maximize its own total discounted reward by making …

Cited by 43 Related articles All 5 versions

[PDF] sciencedirect.com

Exploration-exploitation in multi-agent learning: Catastrophe theory meets game theory

S Leonardos, G Piliouras - Artificial Intelligence, 2022 - Elsevier

Exploration-exploitation is a powerful and practical tool in multi-agent learning (MAL);
however, its effects are far from understood. To make progress in this direction, we study a …

Cited by 41 Related articles All 10 versions
