GPT-4 Surpassing Human Performance in Linguistic Pragmatics

[HTML] Human-Comparable Sensitivity of Large Language Models in Identifying Eligible Studies Through Title and Abstract Screening: 3-Layer Strategy Using GPT …

K Matsui, T Utsumi, Y Aoki, T Maruki… - Journal of Medical …, 2024 - jmir.org

Background The screening process for systematic reviews is resource-intensive. Although
previous machine learning solutions have reported reductions in workload, they risked …

Cited by 5 · Related articles · All 6 versions

[PDF] arxiv.org

GPT-4 vs. human translators: A comprehensive evaluation of translation quality across languages, domains, and expertise levels

J Yan, P Yan, Y Chen, J Li, X Zhu, Y Zhang - arXiv preprint arXiv …, 2024 - arxiv.org

This study comprehensively evaluates the translation quality of Large Language Models
(LLMs), specifically GPT-4, against human translators of varying expertise levels across …

Cited by 8 · Related articles · All 3 versions

[PDF] arxiv.org

Pragmatic Competence Evaluation of Large Language Models for Korean

D Park, J Lee, H Jeong, S Park, S Lee - arXiv preprint arXiv:2403.12675, 2024 - arxiv.org

The current evaluation of Large Language Models (LLMs) predominantly relies on
benchmarks focusing on their embedded knowledge by testing through multiple-choice …

Cited by 4 · Related articles · All 2 versions

[PDF] arxiv.org

MultiPragEval: Multilingual Pragmatic Evaluation of Large Language Models

D Park, J Lee, S Park, H Jeong, Y Koo… - arXiv preprint arXiv …, 2024 - arxiv.org

As the capabilities of LLMs expand, it becomes increasingly important to evaluate them
beyond basic knowledge assessment, focusing on higher-level language understanding …

Cited by 1 · Related articles · All 2 versions

[PDF] arxiv.org

Benchmarking GPT-4 against Human Translators: A Comprehensive Evaluation Across Languages, Domains, and Expertise Levels

J Yan, P Yan, Y Chen, J Li, X Zhu, Y Zhang - arXiv preprint arXiv …, 2024 - arxiv.org

This study presents a comprehensive evaluation of GPT-4's translation capabilities
compared to human translators of varying expertise levels. Through systematic human …

Enhancing LLM Conversational Acuity Using Pragmatic Measures

A Han, N Koushik, N Bidarkundi… - … on Information and …, 2024 - ieeexplore.ieee.org

AI-driven communication has the potential to transform society with superhuman capabilities
such as real-time multilingual translation, predictive text generation, and personalized …

[PDF] researchsquare.com

Using Large Language Models for Automated Grading of Student Writing about Science

C Impey, M Wenger, N Garuda, S Golchin, S Stamer - 2024 - researchsquare.com

A challenge in teaching large classes for formal or informal learners is assessing writing. As
a result, most large classes, especially in science, use objective assessment tools like …
