- Academic resource search

Inadequacies of large language model benchmarks in the era of generative artificial intelligence

TR McIntosh, T Susnjak, N Arachchilage, T Liu… - arXiv preprint arXiv …, 2024 - arxiv.org

The rapid rise in popularity of Large Language Models (LLMs) with emerging capabilities
has spurred public curiosity to evaluate and compare different LLMs, leading many …

Cited by 104 Related articles All 3 versions

Moca: Measuring human-language model alignment on causal and moral judgment tasks

A Nie, Y Zhang, AS Amdekar, C Piech… - Advances in …, 2023 - proceedings.neurips.cc

Human commonsense understanding of the physical and social world is organized around
intuitive theories. These theories support making causal and moral judgments. When …

Cited by 31 Related articles All 8 versions

Perils and opportunities in using large language models in psychological research

S Abdurahman, M Atari, F Karimi-Malekabadi… - PNAS …, 2024 - academic.oup.com

The emergence of large language models (LLMs) has sparked considerable interest in their
potential application in psychological research, mainly as a model of the human psyche or …

Cited by 32 Related articles All 5 versions

A comprehensive survey of small language models in the era of large language models: Techniques, enhancements, applications, collaboration with LLMs, and …

F Wang, Z Zhang, X Zhang, Z Wu, T Mo, Q Lu… - arXiv preprint arXiv …, 2024 - ai.radensa.ru

Large language models (LLM) have demonstrated emergent abilities in text generation,
question answering, and reasoning, facilitating various tasks and domains. Despite their …

Cited by 3 Related articles All 3 versions

On the humanity of conversational AI: Evaluating the psychological portrayal of LLMs

J Huang, W Wang, EJ Li, MH Lam, S Ren… - The Twelfth …, 2023 - openreview.net

Large Language Models (LLMs) have recently showcased their remarkable capacities, not
only in natural language processing tasks but also across diverse domains such as clinical …

Cited by 33 Related articles

Who is ChatGPT? Benchmarking LLMs' Psychological Portrayal Using PsychoBench

J Huang, W Wang, EJ Li, MH Lam, S Ren… - arXiv preprint arXiv …, 2023 - arxiv.org

Large Language Models (LLMs) have recently showcased their remarkable capacities, not
only in natural language processing tasks but also across diverse domains such as clinical …

Cited by 26 Related articles All 3 versions

Surprising gender biases in GPT

RA Fulgu, V Capraro - Computers in Human Behavior Reports, 2024 - Elsevier

We present eight experiments exploring gender biases in GPT. Initially, GPT was asked to
generate demographics of a potential writer of forty phrases ostensibly written by …

Cited by 4 Related articles All 7 versions

Evaluating cultural adaptability of a large language model via simulation of synthetic personas

L Kwok, M Bravansky, LD Griffin - arXiv preprint arXiv:2408.06929, 2024 - arxiv.org

The success of Large Language Models (LLMs) in multicultural environments hinges on
their ability to understand users' diverse cultural backgrounds. We measure this capability by …

Cited by 4 Related articles All 3 versions

Analyzing Nobel Prize literature with large language models

Z Yang, Z Liu, J Zhang, C Lu, J Tai, T Zhong… - arXiv preprint arXiv …, 2024 - arxiv.org

This study examines the capabilities of advanced Large Language Models (LLMs),
particularly the o1 model, in the context of literary analysis. The outputs of these models are …

Cited by 3 Related articles All 3 versions

What makes your model a low-empathy or warmth person: Exploring the Origins of Personality in LLMs

S Yang, S Zhu, R Bao, L Liu, Y Cheng, L Hu… - arXiv preprint arXiv …, 2024 - arxiv.org

Large language models (LLMs) have demonstrated remarkable capabilities in generating
human-like text and exhibiting personality traits similar to those in humans. However, the …

Cited by 3 Related articles All 4 versions
