A Survey of Confidence Estimation and Calibration in Large Language Models

J Geng, F Cai, Y Wang, H Koeppl… - Proceedings of the …, 2024 - aclanthology.org
Large language models (LLMs) have demonstrated remarkable capabilities across a wide
range of tasks in various domains. Despite their impressive performance, they can be …
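
Calibration in this setting is often measured with the Expected Calibration Error (ECE), which bins predictions by confidence and averages the gap between each bin's accuracy and its mean confidence. A minimal sketch of that metric (the equal-width binning and the toy inputs are illustrative choices, not taken from the survey):

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: weighted average of |accuracy - mean confidence| over confidence bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap  # weight by the bin's share of samples
    return ece

# Overconfident toy predictions: high stated confidence, middling accuracy.
print(expected_calibration_error([0.9, 0.95, 0.8, 0.85], [1, 0, 1, 0]))
```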

SimPO: Simple preference optimization with a reference-free reward

Y Meng, M Xia, D Chen - arXiv preprint arXiv:2405.14734, 2024 - arxiv.org
Direct Preference Optimization (DPO) is a widely used offline preference optimization
algorithm that reparameterizes reward functions in reinforcement learning from human …
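
For reference, the loss SimPO proposes uses the length-normalized log-probability of a response as an implicit reward and adds a target margin gamma, with no reference model. A minimal PyTorch sketch under that reading (the tensor names and toy values are illustrative):

```python
import torch
import torch.nn.functional as F

def simpo_loss(logp_chosen, logp_rejected, len_chosen, len_rejected,
               beta=2.0, gamma=1.0):
    """-log sigmoid(beta * (avg logp chosen - avg logp rejected) - gamma).

    logp_*: summed token log-probs of each response under the current policy.
    len_*:  response lengths in tokens, used for length normalization.
    """
    r_chosen = beta * logp_chosen / len_chosen      # implicit reward
    r_rejected = beta * logp_rejected / len_rejected
    return -F.logsigmoid(r_chosen - r_rejected - gamma).mean()

# Toy batch of two preference pairs.
loss = simpo_loss(torch.tensor([-20.0, -15.0]), torch.tensor([-40.0, -30.0]),
                  torch.tensor([10.0, 8.0]), torch.tensor([12.0, 9.0]))
```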

Alleviating hallucinations of large language models through induced hallucinations

Y Zhang, L Cui, W Bi, S Shi - arXiv preprint arXiv:2312.15710, 2023 - arxiv.org
Despite their impressive capabilities, large language models (LLMs) have been observed to
generate responses that include inaccurate or fabricated information, a phenomenon …
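
Methods in this family typically fine-tune a copy of the model to hallucinate more, then contrast the two models' next-token distributions at decoding time. A hedged sketch of one common form of that contrastive step (the amplification weight alpha and the plausibility cutoff tau are assumed hyperparameters, not necessarily the paper's exact formulation):

```python
import torch

def contrast_with_induced_model(base_logits, hallu_logits, alpha=1.0, tau=0.1):
    """Push next-token scores away from an induced 'hallucination' model.

    base_logits, hallu_logits: [vocab]-shaped logits from the original model
    and from the deliberately hallucination-prone model.
    """
    contrasted = (1 + alpha) * base_logits - alpha * hallu_logits
    # Plausibility constraint: only keep tokens the base model itself rates
    # within a factor tau of its most probable token.
    base_probs = torch.softmax(base_logits, dim=-1)
    implausible = base_probs < tau * base_probs.max()
    return contrasted.masked_fill(implausible, float("-inf"))

base, hallu = torch.randn(50), torch.randn(50)  # toy 50-token vocabulary
next_token = contrast_with_induced_model(base, hallu).argmax()
```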

Unfamiliar finetuning examples control how language models hallucinate

K Kang, E Wallace, C Tomlin, A Kumar… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models are known to hallucinate when faced with unfamiliar queries, but
the underlying mechanisms that govern how models hallucinate are not yet fully understood …

RLAIF-V: Aligning MLLMs through open-source AI feedback for super GPT-4V trustworthiness

T Yu, H Zhang, Y Yao, Y Dang, D Chen, X Lu… - arXiv preprint arXiv …, 2024 - arxiv.org
Learning from feedback reduces hallucination in multimodal large language models
(MLLMs) by aligning them with human preferences. While traditional methods rely on labor …

Think twice before assure: Confidence estimation for large language models through reflection on multiple answers

M Li, W Wang, F Feng, F Zhu, Q Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Confidence estimation, which aims to evaluate the trustworthiness of model outputs, is crucial for applying
large language models (LLMs), especially black-box ones. Existing confidence estimation …
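
Black-box methods in this vein generally sample several candidate answers, prompt the model to reflect on each, and turn the resulting agreement into a confidence score. A rough sketch of that recipe; the `llm` function and the verification prompt are placeholders, not the paper's implementation:

```python
from collections import Counter

def llm(prompt: str, temperature: float = 1.0) -> str:
    """Placeholder for a black-box LLM API call."""
    raise NotImplementedError

def reflective_confidence(question: str, k: int = 5):
    """Sample k answers, have the model judge each, and score self-agreement."""
    answers = [llm(question, temperature=1.0) for _ in range(k)]
    votes = Counter()
    for ans in answers:
        verdict = llm(f"Question: {question}\nCandidate answer: {ans}\n"
                      "Is this answer correct? Reply yes or no.", temperature=0.0)
        if verdict.strip().lower().startswith("yes"):
            votes[ans] += 1
    if not votes:
        return answers[0], 0.0
    best, count = votes.most_common(1)[0]
    return best, count / k  # fraction of self-endorsed samples as confidence
```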

Calibrated self-rewarding vision language models

Y Zhou, Z Fan, D Cheng, S Yang, Z Chen, C Cui… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Vision-Language Models (LVLMs) have made substantial progress by integrating pre-
trained large language models (LLMs) and vision models through instruction tuning. Despite …

How easy is it to fool your multimodal LLMs? An empirical analysis on deceptive prompts

Y Qian, H Zhang, Y Yang, Z Gan - arXiv preprint arXiv:2402.13220, 2024 - arxiv.org
The remarkable advancements in Multimodal Large Language Models (MLLMs) have not
rendered them immune to challenges, particularly in the context of handling deceptive …

RELIC: Investigating large language model responses using self-consistency

F Cheng, V Zouhar, S Arora, M Sachan… - Proceedings of the CHI …, 2024 - dl.acm.org
Large Language Models (LLMs) are notorious for blending fact with fiction and generating
non-factual content, known as hallucinations. To address this challenge, we propose an …
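
Self-consistency checks of this kind reduce to sampling the model several times and measuring how often it converges on the same answer. A minimal sketch, with `sample_answer` standing in for whatever generation call is used (a hypothetical callable, not RELIC's actual interface):

```python
from collections import Counter

def self_consistency(sample_answer, question: str, n: int = 10):
    """Sample n answers; the majority answer's frequency serves as a rough
    consistency score (low agreement suggests possible hallucination)."""
    answers = [sample_answer(question) for _ in range(n)]
    top, count = Counter(answers).most_common(1)[0]
    return top, count / n

# Usage with a hypothetical model wrapper:
# top, score = self_consistency(lambda q: my_model.generate(q), question)
```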

Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?

Z Gekhman, G Yona, R Aharoni, M Eyal… - arXiv preprint arXiv …, 2024 - arxiv.org
When large language models are aligned via supervised fine-tuning, they may encounter
new factual information that was not acquired through pre-training. It is often conjectured that …