Dealing with non-stationarity in multi-agent deep reinforcement learning

G Papoudakis, F Christianos, A Rahman… - arXiv preprint arXiv …, 2019 - arxiv.org
Recent developments in deep reinforcement learning are concerned with creating decision-
making agents which can perform well in various complex domains. A particular approach …

Decentralized Q-learning in zero-sum Markov games

M Sayin, K Zhang, D Leslie, T Basar… - Advances in Neural …, 2021 - proceedings.neurips.cc
We study multi-agent reinforcement learning (MARL) in infinite-horizon discounted zero-sum
Markov games. We focus on the practical but challenging setting of decentralized MARL …

Adversarial environment reinforcement learning algorithm for intrusion detection

G Caminero, M Lopez-Martin, B Carro - Computer Networks, 2019 - Elsevier
Intrusion detection is a crucial service in today's data networks, and the search for new fast
and robust algorithms that are capable of detecting and classifying dangerous traffic is …

Multi-objective multi-agent decision making: a utility-based analysis and survey

R Rădulescu, P Mannion, DM Roijers… - Autonomous Agents and …, 2020 - Springer
The majority of multi-agent system implementations aim to optimise agents' policies with
respect to a single objective, despite the fact that many real-world problem domains are …

Lenient multi-agent deep reinforcement learning

G Palmer, K Tuyls, D Bloembergen… - arXiv preprint arXiv …, 2017 - arxiv.org
Much of the success of single agent deep reinforcement learning (DRL) in recent years can
be attributed to the use of experience replay memories (ERM), which allow Deep Q …

Prospects for multi-agent collaboration and gaming: challenge, technology, and application

Y Liu, Z Li, Z Jiang, Y He - Frontiers of Information Technology & Electronic …, 2022 - Springer
Conclusions: In this study, we presented the prospects for multi-agent system research with a
special focus on agent collaboration and gaming tasks. We briefly introduced some open …

Actor-critic policy optimization in partially observable multiagent environments

S Srinivasan, M Lanctot, V Zambaldi… - Advances in neural …, 2018 - proceedings.neurips.cc
Optimization of parameterized policies for reinforcement learning (RL) is an important and
challenging problem in artificial intelligence. Among the most common approaches are …

Autonomous algorithmic collusion: Q‐learning under sequential pricing

T Klein - The RAND Journal of Economics, 2021 - Wiley Online Library
Prices are increasingly set by algorithms. One concern is that intelligent algorithms may
learn to collude on higher prices even in the absence of the kind of coordination necessary …

Policy optimization provably converges to Nash equilibria in zero-sum linear quadratic games

K Zhang, Z Yang, T Basar - Advances in Neural Information …, 2019 - proceedings.neurips.cc
We study the global convergence of policy optimization for finding the Nash equilibria (NE)
in zero-sum linear quadratic (LQ) games. To this end, we first investigate the landscape of …

Event-triggered communication network with limited-bandwidth constraint for multi-agent reinforcement learning

G Hu, Y Zhu, D Zhao, M Zhao… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Communicating agents with each other in a distributed manner and behaving as a group are
essential in multi-agent reinforcement learning. However, real-world multi-agent systems …