A survey of language model confidence estimation and calibration

J Geng, F Cai, Y Wang, H Koeppl, P Nakov… - arXiv preprint arXiv …, 2023 - arxiv.org
Language models (LMs) have demonstrated remarkable capabilities across a wide range of
tasks in various domains. Despite their impressive performance, the reliability of their output …

Uncertainty in natural language processing: Sources, quantification, and applications

M Hu, Z Zhang, S Zhao, M Huang, B Wu - arXiv preprint arXiv:2306.04459, 2023 - arxiv.org
As a main field of artificial intelligence, natural language processing (NLP) has achieved
remarkable success via deep neural networks. Plenty of NLP tasks have been addressed in …

Shifting attention to relevance: Towards the uncertainty estimation of large language models

J Duan, H Cheng, S Wang, C Wang, A Zavalny… - arXiv preprint arXiv …, 2023 - arxiv.org
Although Large Language Models (LLMs) have shown great potential in Natural Language
Generation, it is still challenging to characterize the uncertainty of model generations, i.e. …

Hybrid uncertainty quantification for selective text classification in ambiguous tasks

A Vazhentsev, G Kuzmin, A Tsvigun… - Proceedings of the …, 2023 - aclanthology.org
Many text classification tasks are inherently ambiguous, which results in automatic systems
having a high risk of making mistakes, in spite of using advanced machine learning models …

Shifting attention to relevance: Towards the predictive uncertainty quantification of free-form large language models

J Duan, H Cheng, S Wang, A Zavalny… - Proceedings of the …, 2024 - aclanthology.org
Large Language Models (LLMs) show promising results in language generation
and instruction following but frequently “hallucinate”, making their outputs less reliable …

LM-polygraph: Uncertainty estimation for language models

E Fadeeva, R Vashurin, A Tsvigun… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent advancements in the capabilities of large language models (LLMs) have paved the
way for a myriad of groundbreaking applications in various fields. However, a significant …

Word-sequence entropy: Towards uncertainty estimation in free-form medical question answering applications and beyond

Z Wang, J Duan, C Yuan, Q Chen, T Chen… - … Applications of Artificial …, 2025 - Elsevier
Uncertainty estimation is crucial for the reliability of safety-critical human and artificial
intelligence (AI) interaction systems, particularly in the domain of healthcare engineering …

Uncertainty-aware unlikelihood learning improves generative aspect sentiment quad prediction

M Hu, Y Bai, Y Wu, Z Zhang, L Zhang, H Gao… - arXiv preprint arXiv …, 2023 - arxiv.org
Recently, aspect sentiment quad prediction has received widespread attention in the field of
aspect-based sentiment analysis. Existing studies extract quadruplets via pre-trained …

A Survey of Confidence Estimation and Calibration in Large Language Models

J Geng, F Cai, Y Wang, H Koeppl… - Proceedings of the …, 2024 - aclanthology.org
Large language models (LLMs) have demonstrated remarkable capabilities across a wide
range of tasks in various domains. Despite their impressive performance, they can be …

Test optimization in DNN testing: a survey

Q Hu, Y Guo, X Xie, M Cordy, L Ma… - ACM Transactions on …, 2024 - dl.acm.org
This article presents a comprehensive survey on test optimization in deep neural network
(DNN) testing. Here, test optimization refers to testing with low data labeling effort. We …