A survey on evaluation of large language models

Y Chang, X Wang, J Wang, Y Wu, L Yang… - ACM Transactions on …, 2024 - dl.acm.org
Large language models (LLMs) are gaining increasing popularity in both academia and
industry, owing to their unprecedented performance in various applications. As LLMs …

From black boxes to actionable insights: a perspective on explainable artificial intelligence for scientific discovery

Z Wu, J Chen, Y Li, Y Deng, H Zhao… - Journal of Chemical …, 2023 - ACS Publications
The application of Explainable Artificial Intelligence (XAI) in the field of chemistry has
garnered growing interest for its potential to justify the prediction of black-box machine …

Can LLMs express their uncertainty? An empirical evaluation of confidence elicitation in LLMs

M Xiong, Z Hu, X Lu, Y Li, J Fu, J He, B Hooi - arXiv preprint arXiv …, 2023 - arxiv.org
The task of empowering large language models (LLMs) to accurately express their
confidence, referred to as confidence elicitation, is essential in ensuring reliable and …

Evaluation and analysis of hallucination in large vision-language models

J Wang, Y Zhou, G Xu, P Shi, C Zhao, H Xu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Vision-Language Models (LVLMs) have recently achieved remarkable success.
However, LVLMs are still plagued by the hallucination problem, which limits the practicality …

Label-free node classification on graphs with large language models (LLMs)

Z Chen, H Mao, H Wen, H Han, W Jin, H Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
In recent years, there have been remarkable advancements in node classification achieved
by Graph Neural Networks (GNNs). However, they necessitate abundant high-quality labels …

Alignment for honesty

Y Yang, E Chern, X Qiu, G Neubig, P Liu - arXiv preprint arXiv:2312.07000, 2023 - arxiv.org
Recent research has made significant strides in applying alignment techniques to enhance
the helpfulness and harmlessness of large language models (LLMs) in accordance with …

"I'm Not Sure, But...": Examining the Impact of Large Language Models' Uncertainty Expression on User Reliance and Trust

SSY Kim, QV Liao, M Vorvoreanu, S Ballard… - The 2024 ACM …, 2024 - dl.acm.org
Widely deployed large language models (LLMs) can produce convincing yet incorrect
outputs, potentially misleading users who may rely on them as if they were correct. To …

An emulator for fine-tuning large language models using small language models

E Mitchell, R Rafailov, A Sharma, C Finn… - arXiv preprint arXiv …, 2023 - arxiv.org
Widely used language models (LMs) are typically built by scaling up a two-stage training
pipeline: a pre-training stage that uses a very large, diverse dataset of text and a fine-tuning …

Decomposing uncertainty for large language models through input clarification ensembling

B Hou, Y Liu, K Qian, J Andreas, S Chang… - arXiv preprint arXiv …, 2023 - arxiv.org
Uncertainty decomposition refers to the task of decomposing the total uncertainty of a model
into data (aleatoric) uncertainty, resulting from the inherent complexity or ambiguity of the …

Quantifying uncertainty in answers from any language model via intrinsic and extrinsic confidence assessment

J Chen, J Mueller - arXiv preprint arXiv:2308.16175, 2023 - arxiv.org
We introduce BSDetector, a method for detecting bad and speculative answers from a
pretrained Large Language Model by estimating a numeric confidence score for any output …