Capturing failures of large language models via human cognitive biases

X Hou, Y Zhao, Y Liu, Z Yang, K Wang, L Li… - ACM Transactions on …, 2024 - dl.acm.org

Large Language Models (LLMs) have significantly impacted numerous domains, including
Software Engineering (SE). Many recent publications have explored LLMs applied to …

Cited by 489 Related articles All 8 versions

Large language models can be easily distracted by irrelevant context

F Shi, X Chen, K Misra, N Scales… - International …, 2023 - proceedings.mlr.press

Large language models have achieved impressive performance on various natural
language processing tasks. However, so far they have been evaluated primarily on …

Cited by 374 Related articles All 6 versions

Using large language models to simulate multiple humans and replicate human subject studies

GV Aher, RI Arriaga, AT Kalai - International Conference on …, 2023 - proceedings.mlr.press

We introduce a new type of test, called a Turing Experiment (TE), for evaluating to what
extent a given language model, such as GPT models, can simulate different aspects of …

Cited by 440 Related articles All 7 versions

Large language models and cognitive science: A comprehensive review of similarities, differences, and challenges

Q Niu, J Liu, Z Bi, P Feng, B Peng, K Chen, M Li… - arXiv preprint arXiv …, 2024 - arxiv.org

This comprehensive review explores the intersection of Large Language Models (LLMs) and
cognitive science, examining similarities and differences between LLMs and human …

Cited by 9 Related articles All 2 versions

The role of large language models in medical education: applications and implications

CW Safranek, AE Sidamon-Eristoff, A Gilson… - JMIR medical …, 2023 - mededu.jmir.org

Large language models (LLMs) such as ChatGPT have sparked extensive discourse within
the medical education community, spurring both excitement and apprehension. Written from …

Cited by 116 Related articles All 6 versions

Large legal fictions: Profiling legal hallucinations in large language models

M Dahl, V Magesh, M Suzgun… - Journal of Legal Analysis, 2024 - academic.oup.com

Do large language models (LLMs) know the law? LLMs are increasingly being used to
augment legal practice, education, and research, yet their revolutionary potential is …

Cited by 87 Related articles All 5 versions

Foundational challenges in assuring alignment and safety of large language models

U Anwar, A Saparov, J Rando, D Paleka… - arXiv preprint arXiv …, 2024 - arxiv.org

This work identifies 18 foundational challenges in assuring the alignment and safety of large
language models (LLMs). These challenges are organized into three different categories …

Cited by 115 Related articles All 3 versions

Automated repair of programs from large language models

Z Fan, X Gao, M Mirchev… - 2023 IEEE/ACM 45th …, 2023 - ieeexplore.ieee.org

Large language models such as Codex have shown the capability to produce code for
many programming tasks. However, the success rate of existing models is low, especially for …

Cited by 243 Related articles All 9 versions

Automatically auditing large language models via discrete optimization

E Jones, A Dragan, A Raghunathan… - International …, 2023 - proceedings.mlr.press

Auditing large language models for unexpected behaviors is critical to preempt catastrophic
deployments, yet remains challenging. In this work, we cast auditing as an optimization …

Cited by 147 Related articles All 7 versions

Do LLMs exhibit human-like response biases? A case study in survey design

L Tjuatja, V Chen, T Wu, A Talwalkar… - Transactions of the …, 2024 - direct.mit.edu

One widely cited barrier to the adoption of LLMs as proxies for humans in subjective tasks is
their sensitivity to prompt wording—but interestingly, humans also display sensitivities to …

Cited by 54 Related articles All 5 versions
