A survey of learning in multiagent environments: Dealing with non-stationarity

P Hernandez-Leal, M Kaisers, T Baarslag… - arXiv preprint arXiv …, 2017 - arxiv.org
The key challenge in multiagent learning is learning a best response to the behaviour of
other agents, which may be non-stationary: if the other agents adapt their strategy as well …

An opponent modeling framework for multi-agent game confrontation

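罗俊仁, 张万鹏, 袁唯淋, 胡振震, 陈少飞… - Journal of System Simulation (系统仿真学报), 2022 - china-simulation.com
As a key technology for multi-agent game confrontation, opponent modeling is a typical method for modeling agents' cognitive behavior.
The paper introduces several typical models of multi-agent game confrontation, the non-stationarity problem, and related meta-game theory; it reviews and summarizes opponent modeling methods …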

Efficiently detecting switches against non-stationary opponents

P Hernandez-Leal, Y Zhan, ME Taylor… - Autonomous Agents and …, 2017 - Springer
Interactions in multiagent systems are generally more complicated than single agent ones.
Game theory provides solutions on how to act in multiagent scenarios; however, it assumes …

Research on opponent modeling framework for multi-agent game confrontation

J Luo, W Zhang, W Yuan, Z Hu… - Journal of …, 2022 - dc-china-simulation …
As the key technology of multi-agent game confrontation, opponent modeling is a typical
cognitive modeling method of agent's behavior. Several typical models of multi-agent game …

Efficient policy detecting and reusing for non-stationarity in Markov games

Y Zheng, J Hao, Z Zhang, Z Meng, T Yang, Y Li… - Autonomous Agents and …, 2021 - Springer
One challenging problem in multiagent systems is to cooperate or compete with
non-stationary agents that change behavior from time to time. An agent in such a non-stationary …

Towards a fast detection of opponents in repeated stochastic games

P Hernandez-Leal, M Kaisers - … 2017 Workshops, Best Papers, São Paulo …, 2017 - Springer
Multi-agent algorithms aim to find the best response in strategic interactions. While many
state-of-the-art algorithms assume repeated interaction with a fixed set of opponents (or …

Identifying and tracking switching, non-stationary opponents: A Bayesian approach

P Hernandez-Leal, ME Taylor, BS Rosman, LE Sucar… - 2016 - cdn.aaai.org
In many situations, agents are required to use a set of strategies (behaviors) and switch
among them during the course of an interaction. This work focuses on the problem of …

An online learning algorithm to play discounted repeated games in wireless networks

J Parras, PA Apellániz, S Zazo - Engineering Applications of Artificial …, 2022 - Elsevier
Discounted repeated games are currently being used to model the conflicts that arise
between the nodes in a wireless network, such as distributed resource allocation …

An exploration strategy for non-stationary opponents

P Hernandez-Leal, Y Zhan, ME Taylor… - Autonomous Agents and …, 2017 - Springer
The success or failure of any learning algorithm is partially due to the exploration strategy it
exerts. However, most exploration strategies assume that the environment is stationary and …

Learning adversarial policy in multiple scenes environment via multi-agent reinforcement learning

Y Li, X Wang, W Wang, Z Zhang, J Wang… - Connection …, 2021 - Taylor & Francis
Learning adversarial policy, which aims to learn behavioural strategies for agents with different
goals, is one of the most significant tasks in multi-agent systems. Multi-agent reinforcement …