Collaborating with humans without human data

DJ Strouse, K McKee, M Botvinick… - Advances in …, 2021 - proceedings.neurips.cc
Collaborating with humans requires rapidly adapting to their individual strengths,
weaknesses, and preferences. Unfortunately, most standard multi-agent reinforcement …

Maximum entropy population-based training for zero-shot human-AI coordination

R Zhao, J Song, Y Yuan, H Hu, Y Gao, Y Wu… - Proceedings of the …, 2023 - ojs.aaai.org
We study the problem of training a Reinforcement Learning (RL) agent that is collaborative
with humans without using human data. Although such agents can be obtained through self …

Learning zero-shot cooperation with humans, assuming humans are biased

C Yu, J Gao, W Liu, B Xu, H Tang, J Yang… - arXiv preprint arXiv …, 2023 - arxiv.org
There is a recent trend of applying multi-agent reinforcement learning (MARL) to train an
agent that can cooperate with humans in a zero-shot fashion without using any human data …

Cooperative open-ended learning framework for zero-shot coordination

Y Li, S Zhang, J Sun, Y Du, Y Wen… - International …, 2023 - proceedings.mlr.press
Zero-shot coordination in cooperative artificial intelligence (AI) remains a significant
challenge, which means effectively coordinating with a wide range of unseen partners …

Semantically aligned task decomposition in multi-agent reinforcement learning

W Li, D Qiao, B Wang, X Wang, B Jin, H Zha - arXiv preprint arXiv …, 2023 - arxiv.org
The difficulty of appropriately assigning credit is particularly heightened in cooperative
MARL with sparse reward, due to the concurrent time and structural scales involved …

Tackling cooperative incompatibility for zero-shot human-AI coordination

Y Li, S Zhang, J Sun, W Zhang, Y Du, Y Wen… - Journal of Artificial …, 2024 - jair.org
Securing coordination between AI agent and teammates (human players or AI agents) in
contexts involving unfamiliar humans continues to pose a significant challenge in Zero-Shot …

Quantifying the effects of environment and population diversity in multi-agent reinforcement learning

KR McKee, JZ Leibo, C Beattie, R Everett - Autonomous Agents and Multi …, 2022 - Springer
Generalization is a major challenge for multi-agent reinforcement learning. How well does
an agent perform when placed in novel environments and in interactions with new co …

The Boltzmann policy distribution: Accounting for systematic suboptimality in human models

C Laidlaw, A Dragan - arXiv preprint arXiv:2204.10759, 2022 - arxiv.org
Models of human behavior for prediction and collaboration tend to fall into two categories:
ones that learn from large amounts of data via imitation learning, and ones that assume …

COACH: Cooperative robot teaching

C Yu, Y Xu, L Li, D Hsu - Conference on Robot Learning, 2023 - proceedings.mlr.press
Knowledge and skills can transfer from human teachers to human students.
However, such direct transfer is often not scalable for physical tasks, as they require one-to …

Human-AI shared control via policy dissection

Q Li, Z Peng, H Wu, L Feng… - Advances in Neural …, 2022 - proceedings.neurips.cc
Human-AI shared control allows humans to interact and collaborate with autonomous agents
to accomplish control tasks in complex environments. Previous Reinforcement Learning …