Extractive is not faithful: An investigation of broad unfaithfulness problems in extractive...

Y Huang, L Sun, H Wang, S Wu, Q Zhang, Y Li… - arXiv preprint arXiv …, 2024 - arxiv.org

Large language models (LLMs), exemplified by ChatGPT, have gained considerable
attention for their excellent natural language processing capabilities. Nonetheless, these …

Cited by 243 Related articles All 4 versions

Position: TrustLLM: Trustworthiness in large language models

Y Huang, L Sun, H Wang, S Wu… - International …, 2024 - proceedings.mlr.press

Large language models (LLMs) have gained considerable attention for their excellent
natural language processing capabilities. Nonetheless, these LLMs present many …

Cited by 39 Related articles

Applicability of large language models and generative models for legal case judgement summarization

A Deroy, K Ghosh, S Ghosh - Artificial Intelligence and Law, 2024 - Springer

Automatic summarization of legal case judgements, which are known to be long and
complex, has traditionally been tried via extractive summarization models. In recent years …

Cited by 24 Related articles All 4 versions

Raptor: Recursive abstractive processing for tree-organized retrieval

P Sarthi, S Abdullah, A Tuli, S Khanna, A Goldie… - arXiv preprint arXiv …, 2024 - arxiv.org

Retrieval-augmented language models can better adapt to changes in world state and
incorporate long-tail knowledge. However, most existing methods retrieve only short …

Cited by 84 Related articles All 3 versions

A meta-evaluation of faithfulness metrics for long-form hospital-course summarization

G Adams, J Zucker, N Elhadad - Machine Learning for …, 2023 - proceedings.mlr.press

Long-form clinical summarization of hospital admissions has real-world significance
because of its potential to help both clinicians and patients. The factual consistency of …

Cited by 23 Related articles All 6 versions

An investigation of evaluation methods in automatic medical note generation

AB Abacha, W Yim, G Michalopoulos… - Findings of the …, 2023 - aclanthology.org

Recent studies on automatic note generation have shown that doctors can save significant
amounts of time when using automatic clinical note generation (Knoll et al., 2022) …

Cited by 18 Related articles All 2 versions

Docfinqa: A long-context financial reasoning dataset

V Reddy, R Koncel-Kedziorski, VD Lai… - arXiv preprint arXiv …, 2024 - arxiv.org

For large language models (LLMs) to be effective in the financial domain--where each
decision can have a significant impact--it is necessary to investigate realistic tasks and data …

Cited by 9 Related articles All 2 versions

Aligning factual consistency for clinical studies summarization through reinforcement learning

X Tang, A Cohan, M Gerstein - Proceedings of the 5th Clinical …, 2023 - aclanthology.org

In the rapidly evolving landscape of medical research, accurate and concise summarization
of clinical studies is crucial to support evidence-based practice. This paper presents a novel …

Cited by 8 Related articles All 2 versions

An investigation of evaluation metrics for automated medical note generation

AB Abacha, W Yim, G Michalopoulos, T Lin - arXiv preprint arXiv …, 2023 - arxiv.org

Recent studies on automatic note generation have shown that doctors can save significant
amounts of time when using automatic clinical note generation (Knoll et al., 2022) …

Cited by 17 Related articles All 3 versions

What are the desired characteristics of calibration sets? Identifying correlates on long form scientific summarization

G Adams, BH Nguyen, J Smith, Y Xia… - Proceedings of the …, 2023 - ncbi.nlm.nih.gov

Summarization models often generate text that is poorly calibrated to quality metrics
because they are trained to maximize the likelihood of a single reference (MLE). To address …

Cited by 6 Related articles All 9 versions
