W Mao, L Yang, K Zhang… - … Conference on Machine …, 2022 - proceedings.mlr.press
Multi-agent reinforcement learning (MARL) algorithms often suffer from an exponential sample complexity dependence on the number of agents, a phenomenon known as the …
Offline reinforcement learning leverages previously collected offline datasets to learn optimal policies with no necessity to access the real environment. Such a paradigm is also …
Natural policy gradient has emerged as one of the most successful algorithms for computing optimal policies in challenging Reinforcement Learning (RL) tasks, yet, very little was known …
Policy gradient (PG) methods are popular reinforcement learning (RL) methods where a baseline is often applied to reduce the variance of gradient estimates. In multi-agent RL …
The necessity for cooperation among intelligent machines has popularised cooperative multi-agent reinforcement learning (MARL) in AI research. However, many research endeavours …
Cooperative multi-agent reinforcement learning (MARL) is making rapid progress for solving tasks in a grid world and real-world scenarios, in which agents are given different attributes …
This work studies an independent natural policy gradient (NPG) algorithm for the multi-agent reinforcement learning problem in Markov potential games. It is shown that, under mild …
We study the stationary points and local geometry of gradient play for stochastic games (SGs), where each agent tries to maximize its own total discounted reward by making …
Exploration-exploitation is a powerful and practical tool in multi-agent learning (MAL); however, its effects are far from understood. To make progress in this direction, we study a …