S Bailis, J Friedhoff, F Chen - arXiv preprint arXiv:2407.13943, 2024 - arxiv.org
This paper introduces Werewolf Arena, a novel framework for evaluating large language
models (LLMs) through the lens of the classic social deduction game, Werewolf. In Werewolf …