A parametric, resource-bounded generalization of Löb’s theorem, and a robust cooperation...

R Ngo, L Chan, S Mindermann - arXiv preprint arXiv:2209.00626, 2022 - arxiv.org

In coming decades, artificial general intelligence (AGI) may surpass human capabilities at
many critical tasks. We argue that, without substantial effort to prevent it, AGIs could learn to …

Cited by 184 · Related articles · All 4 versions

[PDF] arxiv.org

Open problems in cooperative AI

A Dafoe, E Hughes, Y Bachrach, T Collins… - arXiv preprint arXiv …, 2020 - arxiv.org

Problems of cooperation--in which agents seek ways to jointly improve their welfare--are
ubiquitous and important. They can be found at scales ranging from our daily routines--such …

Cited by 251 · Related articles · All 3 versions

[PDF] aaai.org

Foundations of cooperative AI

V Conitzer, C Oesterheld - Proceedings of the AAAI Conference on …, 2023 - ojs.aaai.org

AI systems can interact in unexpected ways, sometimes with disastrous consequences. As
AI gets to control more of our world, these interactions will become more common and have …

Cited by 32 · Related articles · All 6 versions

[PDF] arxiv.org

AI research considerations for human existential safety (ARCHES)

A Critch, D Krueger - arXiv preprint arXiv:2006.04948, 2020 - arxiv.org

Framed in positive terms, this report examines how technical AI research might be steered in
a manner that is more attentive to humanity's long-term prospects for survival as a species …

Cited by 60 · Related articles · All 3 versions

[PDF] arxiv.org

Game theory with simulation of other players

V Kovarik, C Oesterheld, V Conitzer - arXiv preprint arXiv:2305.11261, 2023 - arxiv.org

Game-theoretic interactions with AI agents could differ from traditional human-human
interactions in various ways. One such difference is that it may be possible to simulate an AI …

Cited by 13 · Related articles · All 6 versions

[PDF] arxiv.org

Cooperative and uncooperative institution designs: Surprises and problems in open-source game theory

A Critch, M Dennis, S Russell - arXiv preprint arXiv:2208.07006, 2022 - arxiv.org

It is increasingly possible for real-world agents, such as software-based agents or human
institutions, to view the internal programming of other such agents that they interact with. For …

Cited by 7 · Related articles · All 2 versions

[PDF] neurips.cc

Similarity-based cooperative equilibrium

C Oesterheld, J Treutlein, RB Grosse… - Advances in …, 2024 - proceedings.neurips.cc

As machine learning agents act more autonomously in the world, they will increasingly
interact with each other. Unfortunately, in many social dilemmas like the one-shot Prisoner's …

Cited by 4 · Related articles · All 5 versions

[HTML] longtermrisk.org

[HTML] Cooperation, conflict, and transformative artificial intelligence: A research agenda

J Clifton - Effective Altruism Foundation, March, 2020 - longtermrisk.org

The Center on Long-Term Risk's research agenda on Cooperation, Conflict, and
Transformative Artificial Intelligence outlines what we think are the most promising avenues …

Cited by 7 · Related articles

[PDF] academia.edu

[PDF] Similarity-based cooperation

C Oesterheld, J Treutlein, R Grosse… - arXiv preprint arXiv …, 2022 - academia.edu

As machine learning agents act more autonomously in the world, they will increasingly
interact with each other. Unfortunately, in many social dilemmas like the one-shot Prisoner's …

Cited by 5 · Related articles · All 2 versions

[PDF] arxiv.org

White-box adversarial policies in deep reinforcement learning

S Casper, T Killian, G Kreiman… - arXiv preprint arXiv …, 2022 - arxiv.org

In reinforcement learning (RL), adversarial policies can be developed by training an
adversarial agent to minimize a target agent's rewards. Prior work has studied black-box …

Cited by 5 · Related articles · All 2 versions
