A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

Self-rewarding language models

W Yuan, RY Pang, K Cho, S Sukhbaatar, J Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
We posit that to achieve superhuman agents, future models require superhuman feedback
in order to provide an adequate training signal. Current approaches commonly train reward …

LLM-based NLG evaluation: Current status and challenges

M Gao, X Hu, J Ruan, X Pu, X Wan - arXiv preprint arXiv:2402.01383, 2024 - arxiv.org
Evaluating natural language generation (NLG) is a vital but challenging problem in artificial
intelligence. Traditional evaluation metrics mainly capturing content (e.g., n-gram) overlap …

Evaluating large language models at evaluating instruction following

Z Zeng, J Yu, T Gao, Y Meng, T Goyal… - arXiv preprint arXiv …, 2023 - arxiv.org
As research in large language models (LLMs) continues to accelerate, LLM-based
evaluation has emerged as a scalable and cost-effective alternative to human evaluations …

Self-discover: Large language models self-compose reasoning structures

P Zhou, J Pujara, X Ren, X Chen, HT Cheng… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce SELF-DISCOVER, a general framework for LLMs to self-discover the task-
intrinsic reasoning structures to tackle complex reasoning problems that are challenging for …

A comprehensive survey on instruction following

R Lou, K Zhang, W Yin - arXiv preprint arXiv:2303.10475, 2023 - arxiv.org
Task semantics can be expressed by a set of input-output examples or a piece of textual
instruction. Conventional machine learning approaches for natural language processing …

Beyond chatbots: Explorellm for structured thoughts and personalized model responses

X Ma, S Mishra, A Liu, SY Su, J Chen… - Extended Abstracts of …, 2024 - dl.acm.org
Large language model (LLM) powered chatbots are primarily text-based today, and impose
a large interactional cognitive load, especially for exploratory or sensemaking tasks such as …

Topologies of reasoning: Demystifying chains, trees, and graphs of thoughts

M Besta, F Memedi, Z Zhang, R Gerstenberger… - arXiv preprint arXiv …, 2024 - arxiv.org
The field of natural language processing (NLP) has witnessed significant progress in recent
years, with a notable focus on improving large language models' (LLM) performance through …

Distilling System 2 into System 1

P Yu, J Xu, J Weston, I Kulikov - arXiv preprint arXiv:2407.06023, 2024 - arxiv.org
Large language models (LLMs) can spend extra compute during inference to generate
intermediate thoughts, which helps to produce better final responses. Since Chain-of …

Concept -- An Evaluation Protocol on Conversation Recommender Systems with System- and User-centric Factors

C Huang, P Qin, Y Deng, W Lei, J Lv… - arXiv preprint arXiv …, 2024 - arxiv.org
The conversational recommendation system (CRS) has been criticized for its user
experience in real-world scenarios, despite recent significant progress achieved in …