Look Before You Leap: An Exploratory Study of Uncertainty Measurement for Large Language Models

Y Huang, J Song, Z Wang, S Zhao, H Chen… - arXiv preprint arXiv …, 2023 - arxiv.org
The recent performance leap of Large Language Models (LLMs) opens up new
opportunities across numerous industrial applications and domains. However, erroneous …

Bias and Ethics of AI Systems Applied in Auditing - A Systematic Review

W Murikah, JK Nthenge, FM Musyoka - Scientific African, 2024 - Elsevier
The integration of artificial intelligence into auditing shows great potential in enhancing
automation and gaining insights from complex data. However, it also presents significant …

Active Testing of Large Language Model via Multi-Stage Sampling

Y Huang, J Song, Q Hu, F Juefei-Xu, L Ma - arXiv preprint arXiv …, 2024 - arxiv.org
Performance evaluation plays a crucial role in the development life cycle of large language
models (LLMs). It estimates the model's capability, elucidates behavior characteristics, and …

Trustworthiness Assurance Assessment for High-Risk AI-Based Systems

G Stettinger, P Weissensteiner, S Khastgir - IEEE Access, 2024 - ieeexplore.ieee.org
This work proposes methodologies for ensuring the trustworthiness of high-risk artificial
intelligence (AI) systems (AIS) to achieve compliance with the European Union's (EU) AI Act …

Look Before You Leap: An Exploratory Study of Uncertainty Analysis for Large Language Models

Y Huang, J Song, Z Wang, S Zhao… - IEEE Transactions …, 2025 - ieeexplore.ieee.org
The recent performance leap of Large Language Models (LLMs) opens up new
opportunities across numerous industrial applications and domains. However, the potential …

Hallucination Detection in LLMs: Using Bayesian Neural Network Ensembling

GY Arteaga - 2024 - diva-portal.org
Large language models often hallucinate, producing outputs that appear plausible but lack
factual support or deviate from given instructions. This thesis investigates methods for …

Applications of Certainty Scoring for Machine Learning Classification and Out-of-Distribution Detection

AM Berenbeim, AD Cobb, A Roy, S Jha… - ACM Transactions on … - dl.acm.org
Quantitative characterizations and estimations of uncertainty are of fundamental importance
for machine learning classification, particularly in safety-critical settings where continuous …