A framework for learning and planning against switching strategies in repeated games

P Hernandez-Leal, E Munoz de Cote… - Connection Science, 2014 - Taylor & Francis
Intelligent agents, human or artificial, often change their behaviour as they interact with other agents. To optimise its performance when interacting with such agents, an agent must be capable of detecting these changes and adapting to them. This work presents an approach for dealing effectively with non-stationary, switching opponents in a repeated-game context. Our main contribution is a framework for online learning and planning against opponents that switch strategies. We show how two opponent modelling techniques work within the framework and demonstrate the usefulness of the approach experimentally in the iterated prisoner's dilemma, where the opponent is modelled as an agent that switches between different strategies (e.g. TFT, Pavlov and Bully). The two models were compared against each other and against a state-of-the-art non-stationary reinforcement learning technique. The results show that our approach obtains competitive results without needing an offline training phase, unlike the state-of-the-art techniques.
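To make the experimental setting concrete, the following is a minimal sketch of a switching opponent in the iterated prisoner's dilemma. It is not the authors' framework or code. The payoff values, the switching schedule, the names `SwitchingOpponent` and `play`, and the reading of Bully as "always defect" are all assumptions made for illustration. TFT and Pavlov follow their usual textbook definitions.

```python
C, D = "C", "D"

# Assumed payoffs for the learning agent, indexed by (agent action, opponent action).
# These are common textbook values, not taken from the paper.
PAYOFF = {(C, C): 3, (C, D): 0, (D, C): 5, (D, D): 1}


def tft(my_last, other_last):
    """Tit-for-Tat: cooperate first, then copy the other player's last action."""
    return C if other_last is None else other_last


def pavlov(my_last, other_last):
    """Pavlov (win-stay, lose-shift): cooperate iff both played the same action last round."""
    if my_last is None:
        return C
    return C if my_last == other_last else D


def bully(my_last, other_last):
    """Bully, assumed here to mean 'always defect' in the prisoner's dilemma."""
    return D


class SwitchingOpponent:
    """Hypothetical opponent that changes strategy at fixed rounds.

    schedule is a list of (start_round, strategy) pairs sorted by start_round,
    with the first entry starting at round 0.
    """

    def __init__(self, schedule):
        self.schedule = schedule
        self.my_last = None
        self.other_last = None

    def act(self, t):
        # Use the most recent strategy whose start round has been reached.
        strategy = [s for start, s in self.schedule if start <= t][-1]
        return strategy(self.my_last, self.other_last)

    def observe(self, my_action, other_action):
        self.my_last, self.other_last = my_action, other_action


def play(agent_policy, opponent, rounds):
    """Run the repeated game and return the agent's total payoff and the joint history."""
    total, history = 0, []
    for t in range(rounds):
        a = agent_policy(history)
        o = opponent.act(t)
        total += PAYOFF[(a, o)]
        opponent.observe(o, a)
        history.append((a, o))
    return total, history


if __name__ == "__main__":
    schedule = [(0, tft), (5, bully), (10, pavlov)]
    always_cooperate = lambda history: C
    total, history = play(always_cooperate, SwitchingOpponent(schedule), 15)
    print(total, "".join(o for _, o in history))
```

A fixed policy such as `always_cooperate` does well against TFT but is exploited once the opponent switches to Bully, which is why an agent in this setting needs to detect the switch and re-plan online.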