A survey on model-based reinforcement learning

FM Luo, T Xu, H Lai, XH Chen, W Zhang… - Science China Information …, 2024 - Springer
Reinforcement learning (RL) interacts with the environment to solve sequential
decision-making problems via a trial-and-error approach. Errors are always undesirable in real-world …

Model-based multi-agent reinforcement learning: Recent progress and prospects

X Wang, Z Zhang, W Zhang - arXiv preprint arXiv:2203.10603, 2022 - arxiv.org
Significant advances have recently been achieved in Multi-Agent Reinforcement Learning
(MARL) which tackles sequential decision-making problems involving multiple participants …

Multi-player zero-sum Markov games with networked separable interactions

C Park, K Zhang, A Ozdaglar - Advances in Neural …, 2024 - proceedings.neurips.cc
We study a new class of Markov games, (multi-player) zero-sum Markov Games with
Networked separable interactions (zero-sum NMGs), to model the local interaction …

Provably efficient offline multi-agent reinforcement learning via strategy-wise bonus

Q Cui, SS Du - Advances in Neural Information Processing …, 2022 - proceedings.neurips.cc
This paper considers offline multi-agent reinforcement learning. We propose the
strategy-wise concentration principle which directly builds a confidence interval for the joint strategy …

Multi-agent reinforcement learning with reward delays

Y Zhang, R Zhang, Y Gu, N Li - Learning for Dynamics and …, 2023 - proceedings.mlr.press
This paper considers multi-agent reinforcement learning (MARL) where the rewards are
received after delays and the delay time varies across agents and across time steps. Based …

Offline congestion games: How feedback type affects data coverage requirement

H Jiang, Q Cui, Z Xiong, M Fazel, SS Du - arXiv preprint arXiv:2210.13396, 2022 - arxiv.org
This paper investigates when one can efficiently recover an approximate Nash Equilibrium
(NE) in offline congestion games. The existing dataset coverage assumption in offline …

The Danger Of Arrogance: Welfare Equilibra As A Solution To Stackelberg Self-Play In Non-Coincidental Games

J Levi, C Lu, T Willi, CS de Witt, J Foerster - arXiv preprint arXiv …, 2024 - arxiv.org
The increasing prevalence of multi-agent learning systems in society necessitates
understanding how to learn effective and safe policies in general-sum multi-agent …

Robustness of Stochastic Optimal Control to Approximate Diffusion Models Under Several Cost Evaluation Criteria

S Pradhan, S Yüksel - Mathematics of Operations Research, 2023 - pubsonline.informs.org
In control theory, typically a nominal model is assumed based on which an optimal control is
designed and then applied to an actual (true) system. This gives rise to the problem of …

Multi-Agent Dynamic Decision Making and Learning

K Avrachenkov, VS Borkar, UJ Nair - Dynamic Games and Applications, 2023 - Springer
As a large and ever-increasing part of our economic and social interactions moves to the
cyberspace, data-driven algorithmic decision making by autonomous agents is fast …

[CITATION][C] A Weighted Mean Field Reinforcement Learning Algorithm for Large-Scale Multi-Agent Collaboration

X Yuan, H Wang, W Yu - Guidance, Navigation and Control, 2023 - World Scientific
Reinforcement learning has been proven to be an effective approach for solving multi-agent
coordination problems in a dynamic open environment. For dealing with multi-agent …