A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

A survey of GPT-3 family large language models including ChatGPT and GPT-4

KS Kalyan - Natural Language Processing Journal, 2024 - Elsevier
Large language models (LLMs) are a special class of pretrained language models (PLMs)
obtained by scaling model size, pretraining corpus and computation. LLMs, because of their …

GPT-4 passes the bar exam

DM Katz, MJ Bommarito, S Gao… - … Transactions of the …, 2024 - royalsocietypublishing.org
In this paper, we experimentally evaluate the zero-shot performance of GPT-4 against prior
generations of GPT on the entire uniform bar examination (UBE), including not only the …

Survey on factuality in large language models: Knowledge, retrieval and domain-specificity

C Wang, X Liu, Y Yue, X Tang, T Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
This survey addresses the crucial issue of factuality in Large Language Models (LLMs). As
LLMs find applications across diverse domains, the reliability and accuracy of their outputs …

Large language models on graphs: A comprehensive survey

B Jin, G Liu, C Han, M Jiang, H Ji… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Large language models (LLMs), such as GPT4 and LLaMA, are creating significant
advancements in natural language processing, due to their strong text encoding/decoding …

Large legal fictions: Profiling legal hallucinations in large language models

M Dahl, V Magesh, M Suzgun… - Journal of Legal Analysis, 2024 - academic.oup.com
Do large language models (LLMs) know the law? LLMs are increasingly being used to
augment legal practice, education, and research, yet their revolutionary potential is …

Inadequacies of large language model benchmarks in the era of generative artificial intelligence

TR McIntosh, T Susnjak, N Arachchilage, T Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid rise in popularity of Large Language Models (LLMs) with emerging capabilities
has spurred public curiosity to evaluate and compare different LLMs, leading many …

Introducing v0.5 of the AI Safety Benchmark from MLCommons

B Vidgen, A Agrawal, AM Ahmed, V Akinwande… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper introduces v0.5 of the AI Safety Benchmark, which has been created by the
MLCommons AI Safety Working Group. The AI Safety Benchmark has been designed to …

LawBench: Benchmarking legal knowledge of large language models

Z Fei, X Shen, D Zhu, F Zhou, Z Han, S Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have demonstrated strong capabilities in various aspects.
However, when applying them to the highly specialized, safe-critical legal domain, it is …

ExpertQA: Expert-curated questions and attributed answers

C Malaviya, S Lee, S Chen, E Sieber, M Yatskar… - arXiv preprint arXiv …, 2023 - arxiv.org
As language models are adapted by a more sophisticated and diverse set of users, the
importance of guaranteeing that they provide factually correct information supported by …