Self-checker: Plug-and-play modules for fact-checking with large language models

M Li, B Peng, M Galley, J Gao, Z Zhang - arXiv preprint arXiv:2305.14623, 2023 - arxiv.org
Fact-checking is an essential task in NLP that is commonly utilized for validating the factual
accuracy of claims. Prior work has mainly focused on fine-tuning pre-trained languages …

Factuality of large language models in the year 2024

Y Wang, M Wang, MA Manzoor, F Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs), especially when instruction-tuned for chat, have become
part of our daily lives, freeing people from the process of searching, extracting, and …

VERISCORE: Evaluating the factuality of verifiable claims in long-form text generation

Y Song, Y Kim, M Iyyer - arXiv preprint arXiv:2406.19276, 2024 - arxiv.org
Existing metrics for evaluating the factuality of long-form text, such as FACTSCORE (Min et
al., 2023) and SAFE (Wei et al., 2024), decompose an input text into "atomic claims" and …

Multi-fact: Assessing multilingual LLMs' multi-regional knowledge using FActScore

S Shafayat, E Kim, J Oh, A Oh - arXiv preprint arXiv …, 2024 - globalaicultures.github.io
Large Language Models (LLMs) are prone to factuality hallucination, generating
text that contradicts established knowledge. While extensive research has addressed this in …

Automated justification production for claim veracity in fact checking: A survey on architectures and approaches

I Eldifrawi, S Wang, A Trabelsi - arXiv preprint arXiv:2407.12853, 2024 - arxiv.org
Automated Fact-Checking (AFC) is the automated verification of claim accuracy. AFC is
crucial in discerning truth from misinformation, especially given the huge amounts of content …

Factuality of large language models: A survey

Y Wang, M Wang, MA Manzoor, F Liu… - Proceedings of the …, 2024 - aclanthology.org
Large language models (LLMs), especially when instruction-tuned for chat, have become
part of our daily lives, freeing people from the process of searching, extracting, and …

Benchmarks as microscopes: A call for model metrology

M Saxon, A Holtzman, P West, WY Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Modern language models (LMs) pose a new challenge in capability assessment. Static
benchmarks inevitably saturate without providing confidence in the deployment tolerances …

Molecular facts: Desiderata for decontextualization in LLM fact verification

A Gunjal, G Durrett - arXiv preprint arXiv:2406.20079, 2024 - arxiv.org
Automatic factuality verification of large language model (LLM) generations is becoming
more and more widely used to combat hallucinations. A major point of tension in the …

Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph

R Vashurin, E Fadeeva, A Vazhentsev… - arXiv preprint arXiv …, 2024 - arxiv.org
Uncertainty quantification (UQ) is a critical component of machine learning (ML)
applications. The rapid proliferation of large language models (LLMs) has stimulated …

A survey of AI-generated text forensic systems: Detection, attribution, and characterization

T Kumarage, G Agrawal, P Sheth, R Moraffah… - arXiv preprint arXiv …, 2024 - arxiv.org
We have witnessed lately a rapid proliferation of advanced Large Language Models (LLMs)
capable of generating high-quality text. While these LLMs have revolutionized text …