Survey of self-play in reinforcement learning

A DiGiovanni, EC Zell - arXiv preprint arXiv:2107.02850, 2021 - arxiv.org
In reinforcement learning (RL), the term self-play describes a kind of multi-agent learning
(MAL) that deploys an algorithm against copies of itself to test compatibility in various …

Thompson sampling for Markov games with piecewise stationary opponent policies

A DiGiovanni, A Tewari - Uncertainty in Artificial Intelligence, 2021 - proceedings.mlr.press
Reinforcement learning problems with multiple agents pose the challenge of efficiently
adapting to nonstationary dynamics arising from other agents' strategic behavior. Although …

Risk-Aware Multi-Agent Multi-Armed Bandits

Q Shao, J Ye, JCS Lui - Proceedings of the Twenty-fifth International …, 2024 - dl.acm.org
Multi-armed bandits (MAB) is an online learning and decision-making model under
uncertainty. Instead of maximizing the expected utility (or reward) in a classical MAB setting …

A Fairness-Driven Method for Learning Human-Compatible Negotiation Strategies

R Shea, Z Yu - arXiv preprint arXiv:2409.18335, 2024 - arxiv.org
Despite recent advancements in AI and NLP, negotiation remains a difficult domain for AI
agents. Traditional game theoretic approaches that have worked well for two-player zero …

Balancing adaptability and non-exploitability in repeated games

A DiGiovanni, A Tewari - Uncertainty in Artificial Intelligence, 2022 - proceedings.mlr.press
We study the problem of adaptability in repeated games: simultaneously guaranteeing low
regret for several classes of opponents. We add the constraint that our algorithm is non …

[Book] Towards Optimal Algorithms For Online Decision Making Under Practical Constraints

ACY Tossou - 2019 - core.ac.uk
Artificial Intelligence is increasingly being used in real-life applications such as driving with
autonomous cars; deliveries with autonomous drones; customer support with chat-bots; …