Contextualized Sequence Likelihood: Enhanced Confidence Scores for Natural Language Generation

Z Lin, S Trivedi, J Sun - arXiv preprint arXiv:2406.01806, 2024 - arxiv.org
The advent of large language models (LLMs) has dramatically advanced the state of the art
in numerous natural language generation tasks. For LLMs to be applied reliably, it is …

Counterfactual Debating with Preset Stances for Hallucination Elimination of LLMs

Y Fang, M Li, W Wang, H Lin, F Feng - arXiv preprint arXiv:2406.11514, 2024 - arxiv.org
Large Language Models (LLMs) excel in various natural language processing tasks but
struggle with hallucination issues. Existing solutions have considered utilizing LLMs' …

Enhancing Confidence Expression in Large Language Models Through Learning from Past Experience

H Han, T Li, S Chen, J Shi, C Du, Y Xiao… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have exhibited remarkable performance across various
downstream tasks, but they may generate inaccurate or false information with a confident …

Cycles of Thought: Measuring LLM Confidence through Stable Explanations

E Becker, S Soatto - arXiv preprint arXiv:2406.03441, 2024 - arxiv.org
In many high-risk machine learning applications, it is essential for a model to indicate when it
is uncertain about a prediction. While large language models (LLMs) can reach and even …

Topic Modelling Case Law Using a Large Language Model and a New Taxonomy for UK Law: AI Insights into Summary Judgment

H Sargeant, A Izzidien, F Steffek - arXiv preprint arXiv:2405.12910, 2024 - arxiv.org
This paper addresses a critical gap in legal analytics by developing and applying a novel
taxonomy for topic modelling summary judgment cases in the United Kingdom. Using a …

Harmonic LLMs are Trustworthy

NS Kersting, M Rahman, S Vedala, Y Wang - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce an intuitive method to test the robustness (stability and explainability) of any
black-box LLM in real-time, based upon the local deviation from harmoniticity, denoted as …