Survey on factuality in large language models: Knowledge, retrieval and domain-specificity

C Wang, X Liu, Y Yue, X Tang, T Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
This survey addresses the crucial issue of factuality in Large Language Models (LLMs). As
LLMs find applications across diverse domains, the reliability and accuracy of their outputs …

Cognitive mirage: A review of hallucinations in large language models

H Ye, T Liu, A Zhang, W Hua, W Jia - arXiv preprint arXiv:2309.06794, 2023 - arxiv.org
As large language models continue to develop in the field of AI, text generation systems are
susceptible to a worrisome phenomenon known as hallucination. In this study, we …

TrustLLM: Trustworthiness in large language models

Y Huang, L Sun, H Wang, S Wu, Q Zhang, Y Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs), exemplified by ChatGPT, have gained considerable
attention for their excellent natural language processing capabilities. Nonetheless, these …

Position: TrustLLM: Trustworthiness in large language models

Y Huang, L Sun, H Wang, S Wu… - International …, 2024 - proceedings.mlr.press
Large language models (LLMs) have gained considerable attention for their excellent
natural language processing capabilities. Nonetheless, these LLMs present many …

LM vs LM: Detecting factual errors via cross examination

R Cohen, M Hamri, M Geva, A Globerson - arXiv preprint arXiv …, 2023 - arxiv.org
A prominent weakness of modern language models (LMs) is their tendency to generate
factually incorrect text, which hinders their usability. A natural question is whether such …

Evaluating large language models: A comprehensive survey

Z Guo, R Jin, C Liu, Y Huang, D Shi, L Yu, Y Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have demonstrated remarkable capabilities across a broad
spectrum of tasks. They have attracted significant attention and been deployed in numerous …

Halo: Estimation and reduction of hallucinations in open-source weak large language models

M Elaraby, M Lu, J Dunn, X Zhang, Y Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have revolutionized Natural Language Processing (NLP).
Although convenient for research and practical applications, open-source LLMs with fewer …

Generating benchmarks for factuality evaluation of language models

D Muhlgay, O Ram, I Magar, Y Levine, N Ratner… - arXiv preprint arXiv …, 2023 - arxiv.org
Before deploying a language model (LM) within a given domain, it is important to measure
its tendency to generate factually incorrect information in that domain. Existing factual …

Consistency analysis of ChatGPT

ME Jang, T Lukasiewicz - arXiv preprint arXiv:2303.06273, 2023 - arxiv.org
ChatGPT has gained a huge popularity since its introduction. Its positive aspects have been
reported through many media platforms, and some analyses even showed that ChatGPT …

Risk taxonomy, mitigation, and assessment benchmarks of large language model systems

T Cui, Y Wang, C Fu, Y Xiao, S Li, X Deng, Y Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have strong capabilities in solving diverse natural language
processing tasks. However, the safety and security issues of LLM systems have become the …