PromptBench: Towards evaluating the robustness of large language models on adversarial prompts

Y Chang, X Wang, J Wang, Y Wu, L Yang… - ACM Transactions on …, 2024 - dl.acm.org

Large language models (LLMs) are gaining increasing popularity in both academia and
industry, owing to their unprecedented performance in various applications. As LLMs …

Cited by 1054 Related articles All 4 versions

[HTML] sciencedirect.com

[HTML] A survey on large language model (LLM) security and privacy: The good, the bad, and the ugly

Y Yao, J Duan, K Xu, Y Cai, Z Sun, Y Zhang - High-Confidence Computing, 2024 - Elsevier

Large Language Models (LLMs), such as ChatGPT and Bard, have revolutionized
natural language understanding and generation. They possess deep language …

Cited by 192 Related articles All 11 versions

[PDF] arxiv.org

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org

Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

Cited by 2189 Related articles All 4 versions

[PDF] arxiv.org

Universal and transferable adversarial attacks on aligned language models

A Zou, Z Wang, N Carlini, M Nasr, JZ Kolter… - arXiv preprint arXiv …, 2023 - arxiv.org

Because "out-of-the-box" large language models are capable of generating a great deal of
objectionable content, recent work has focused on aligning these models in an attempt to …

Cited by 624 Related articles All 8 versions

[PDF] arxiv.org

Siren's song in the AI ocean: a survey on hallucination in large language models

Y Zhang, Y Li, L Cui, D Cai, L Liu, T Fu… - arXiv preprint arXiv …, 2023 - arxiv.org

While large language models (LLMs) have demonstrated remarkable capabilities across a
range of downstream tasks, a significant concern revolves around their propensity to exhibit …

Cited by 549 Related articles All 2 versions

[PDF] github.io

The rise and potential of large language model based agents: A survey

Z Xi, W Chen, X Guo, W He, Y Ding, B Hong… - arXiv preprint arXiv …, 2023 - arxiv.org

For a long time, humanity has pursued artificial intelligence (AI) equivalent to or surpassing
the human level, with AI agents considered a promising vehicle for this pursuit. AI agents are …

Cited by 422 Related articles All 4 versions

[PDF] arxiv.org

Trustworthy LLMs: A survey and guideline for evaluating large language models' alignment

Y Liu, Y Yao, JF Ton, X Zhang, R Guo, H Cheng… - arXiv preprint arXiv …, 2023 - arxiv.org

Ensuring alignment, which refers to making models behave in accordance with human
intentions [1, 2], has become a critical task before deploying large language models (LLMs) …

Cited by 166 Related articles All 3 versions

[PDF] arxiv.org

Is ChatGPT a good NLG evaluator? A preliminary study

J Wang, Y Liang, F Meng, Z Sun, H Shi, Z Li… - arXiv preprint arXiv …, 2023 - arxiv.org

Recently, the emergence of ChatGPT has attracted wide attention from the computational
linguistics community. Many prior studies have shown that ChatGPT achieves remarkable …

Cited by 249 Related articles All 6 versions

[PDF] qub.ac.uk

[PDF] DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

B Wang, W Chen, H Pei, C Xie, M Kang, C Zhang, C Xu… - NeurIPS, 2023 - blogs.qub.ac.uk

Generative Pre-trained Transformer (GPT) models have exhibited exciting progress
in their capabilities, capturing the interest of practitioners and the public alike. Yet, while the …

Cited by 228 Related articles All 8 versions

[PDF] arxiv.org

TrustLLM: Trustworthiness in large language models

L Sun, Y Huang, H Wang, S Wu, Q Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org

Large language models (LLMs), exemplified by ChatGPT, have gained considerable
attention for their excellent natural language processing capabilities. Nonetheless, these …

Cited by 127 Related articles All 4 versions
