An overview of catastrophic AI risks

D Hendrycks, M Mazeika, T Woodside - arXiv preprint arXiv:2306.12001, 2023 - arxiv.org
Rapid advancements in artificial intelligence (AI) have sparked growing concerns among
experts, policymakers, and world leaders regarding the potential for increasingly advanced …

AI alignment: A comprehensive survey

J Ji, T Qiu, B Chen, B Zhang, H Lou, K Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
AI alignment aims to make AI systems behave in line with human intentions and values. As
AI systems grow more capable, the potential large-scale risks associated with misaligned AI …

Can LLM-generated misinformation be detected?

C Chen, K Shu - arXiv preprint arXiv:2309.13788, 2023 - arxiv.org
The advent of Large Language Models (LLMs) has made a transformative impact. However,
the potential that LLMs such as ChatGPT can be exploited to generate misinformation has …

From Google Gemini to OpenAI Q* (Q-Star): A survey of reshaping the generative artificial intelligence (AI) research landscape

TR McIntosh, T Susnjak, T Liu, P Watters… - arXiv preprint arXiv …, 2023 - arxiv.org
This comprehensive survey explored the evolving landscape of generative Artificial
Intelligence (AI), with a specific focus on the transformative impacts of Mixture of Experts …

Managing AI risks in an era of rapid progress

Y Bengio, G Hinton, A Yao, D Song, P Abbeel… - arXiv preprint arXiv …, 2023 - arxiv.org
In this short consensus paper, we outline risks from upcoming, advanced AI systems. We
examine large-scale social harms and malicious uses, as well as an irreversible loss of …

Managing extreme AI risks amid rapid progress

Y Bengio, G Hinton, A Yao, D Song, P Abbeel, T Darrell… - Science, 2024 - science.org
Artificial intelligence (AI) is progressing rapidly, and companies are shifting their focus to
developing generalist AI systems that can autonomously act and pursue goals. Increases in …

Look before you leap: An exploratory study of uncertainty measurement for large language models

Y Huang, J Song, Z Wang, H Chen, L Ma - arXiv preprint arXiv:2307.10236, 2023 - arxiv.org
The recent performance leap of Large Language Models (LLMs) opens up new
opportunities across numerous industrial applications and domains. However, erroneous …

Black-box access is insufficient for rigorous ai audits

S Casper, C Ezell, C Siegmann, N Kolt… - The 2024 ACM …, 2024 - dl.acm.org
External audits of AI systems are increasingly recognized as a key mechanism for AI
governance. The effectiveness of an audit, however, depends on the degree of access …

Foundational challenges in assuring alignment and safety of large language models

U Anwar, A Saparov, J Rando, D Paleka… - arXiv preprint arXiv …, 2024 - arxiv.org
This work identifies 18 foundational challenges in assuring the alignment and safety of large
language models (LLMs). These challenges are organized into three different categories …

Red-Teaming for Generative AI: Silver Bullet or Security Theater?

M Feffer, A Sinha, ZC Lipton, H Heidari - arXiv preprint arXiv:2401.15897, 2024 - arxiv.org
In response to rising concerns surrounding the safety, security, and trustworthiness of
Generative AI (GenAI) models, practitioners and regulators alike have pointed to AI red …