Effectiveness of large language models in automated evaluation of argumentative essays: finetuning vs. zero-shot prompting

Q Wang, JM Gayed - Computer Assisted Language Learning, 2024 - Taylor & Francis
To address the long-standing challenge facing traditional automated writing evaluation
(AWE) systems in assessing higher-order thinking, this study built an AWE system for …

Can LLMs Reason Like Humans? Assessing Theory of Mind Reasoning in LLMs for Open-Ended Questions

M Amirizaniani, E Martin, M Sivachenko… - Proceedings of the 33rd …, 2024 - dl.acm.org
Theory of mind (ToM) reasoning involves understanding that others have intentions,
emotions, and thoughts, which is crucial for regulating one's reasoning. Although large …

DOMINO: A Dual-System for Multi-step Visual Language Reasoning

P Wang, O Golovneva, A Aghajanyan, X Ren… - arXiv preprint arXiv …, 2023 - arxiv.org
Visual language reasoning requires a system to extract text or numbers from
information-dense images like charts or plots and perform logical or arithmetic reasoning to arrive at an …

ChatGPT Rates Natural Language Explanation Quality Like Humans: But on Which Scales?

F Huang, H Kwak, K Park, J An - arXiv preprint arXiv:2403.17368, 2024 - arxiv.org
As AI becomes more integral in our lives, the need for transparency and responsibility
grows. While natural language explanations (NLEs) are vital for clarifying the reasoning …

Analyzing the Role of Semantic Representations in the Era of Large Language Models

Z Jin, Y Chen, F Gonzalez, J Liu, J Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Traditionally, natural language processing (NLP) models often use a rich set of features
created by linguistic expertise, such as semantic representations. However, in the era of …

Comparing the Evaluation and Production of Loophole Behavior in Humans and Large Language Models

S Murthy, K Parece, S Bridgers, P Qian… - Findings of the …, 2023 - aclanthology.org
In law, lore, and everyday life, loopholes are commonplace. When people exploit a
loophole, they understand the intended meaning or goal of another person, but choose to go …

Do LLMs Exhibit Human-Like Reasoning? Evaluating Theory of Mind in LLMs for Open-Ended Responses

M Amirizaniani, E Martin, M Sivachenko… - arXiv preprint arXiv …, 2024 - arxiv.org
Theory of Mind (ToM) reasoning entails recognizing that other individuals possess their own
intentions, emotions, and thoughts, which is vital for guiding one's own thought processes …

Measuring the Robustness of NLP Models to Domain Shifts

N Calderon, N Porat, E Ben-David, A Chapanin… - arXiv preprint arXiv …, 2023 - arxiv.org
Existing research on Domain Robustness (DR) suffers from disparate setups, limited task
variety, and scarce research on recent capabilities such as in-context learning. Furthermore …