A Survey on LLM-as-a-Judge

J Gu, X Jiang, Z Shi, H Tan, X Zhai, C Xu, W Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Accurate and consistent evaluation is crucial for decision-making across numerous fields,
yet it remains a challenging task due to inherent subjectivity, variability, and scale. Large …

Eureka: Evaluating and understanding large foundation models

V Balachandran, J Chen, N Joshi, B Nushi… - arXiv preprint arXiv …, 2024 - arxiv.org
Rigorous and reproducible evaluation is critical for assessing the state of the art and for
guiding scientific advances in Artificial Intelligence. Evaluation is challenging in practice due …

Do LLMs plan like human writers? Comparing journalist coverage of press releases with LLMs

A Spangher, N Peng, S Gehrmann… - Proceedings of the 2024 …, 2024 - aclanthology.org
Journalists engage in multiple steps in the news writing process that depend on human
creativity, like exploring different “angles” (i.e., the specific perspectives a reporter takes) …

A smart mnemonic sounds like "glue tonic": Mixing LLMs with student feedback to make mnemonic learning stick

N Balepur, M Shu, A Hoyle, A Robey, S Feng… - arXiv preprint arXiv …, 2024 - arxiv.org
Keyword mnemonics are memorable explanations that link new terms to simpler keywords.
Prior work generates mnemonics for students, but they do not train models using mnemonics …

A survey on large language model hallucination via a creativity perspective

X Jiang, Y Tian, F Hua, C Xu, Y Wang, J Guo - arXiv preprint arXiv …, 2024 - arxiv.org
Hallucinations in large language models (LLMs) are always seen as limitations. However,
could they also be a source of creativity? This survey explores this possibility, suggesting …

Characterising the Creative Process in Humans and Large Language Models

SS Nath, P Dayan, C Stevenson - arXiv preprint arXiv:2405.00899, 2024 - arxiv.org
Large language models appear quite creative, often performing on par with the average
human on creative tasks. However, research on LLM creativity has focused solely on …

LLMPhy: Complex physical reasoning using large language models and world models

A Cherian, R Corcodel, S Jain, D Romeres - arXiv preprint arXiv …, 2024 - arxiv.org
Physical reasoning is an important skill needed for robotic agents when operating in the real
world. However, solving such reasoning problems often involves hypothesizing and …

Narrative Creativity: An Introduction to How and Why

A Fletcher, M Benveniste - Elements in Creativity and Imagination, 2025 - cambridge.org
Narrative creativity is a new, neuroscience-based approach to innovation, problem solving,
and resilience that has proved effective in business executives, scientists, engineers …

Large Language Models show both individual and collective creativity comparable to humans

L Sun, Y Yuan, Y Yao, Y Li, H Zhang, X Xie… - arXiv preprint arXiv …, 2024 - arxiv.org
Artificial intelligence has, so far, largely automated routine tasks, but what does it mean for
the future of work if Large Language Models (LLMs) show creativity comparable to humans …

VIVA: A Benchmark for Vision-Grounded Decision-Making with Human Values

Z Hu, Y Ren, J Li, Y Yin - arXiv preprint arXiv:2407.03000, 2024 - arxiv.org
Large vision language models (VLMs) have demonstrated significant potential for
integration into daily life, making it crucial for them to incorporate human values when …