TrustLLM: Trustworthiness in large language models

Y Huang, L Sun, H Wang, S Wu, Q Zhang, Y Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs), exemplified by ChatGPT, have gained considerable
attention for their excellent natural language processing capabilities. Nonetheless, these …

Position: TrustLLM: Trustworthiness in large language models

Y Huang, L Sun, H Wang, S Wu… - International …, 2024 - proceedings.mlr.press
Large language models (LLMs) have gained considerable attention for their excellent
natural language processing capabilities. Nonetheless, these LLMs present many …

Applicability of large language models and generative models for legal case judgement summarization

A Deroy, K Ghosh, S Ghosh - Artificial Intelligence and Law, 2024 - Springer
Automatic summarization of legal case judgements, which are known to be long and
complex, has traditionally been tried via extractive summarization models. In recent years …

RAPTOR: Recursive abstractive processing for tree-organized retrieval

P Sarthi, S Abdullah, A Tuli, S Khanna, A Goldie… - arXiv preprint arXiv …, 2024 - arxiv.org
Retrieval-augmented language models can better adapt to changes in world state and
incorporate long-tail knowledge. However, most existing methods retrieve only short …

A meta-evaluation of faithfulness metrics for long-form hospital-course summarization

G Adams, J Zucker, N Elhadad - Machine Learning for …, 2023 - proceedings.mlr.press
Long-form clinical summarization of hospital admissions has real-world significance
because of its potential to help both clinicians and patients. The factual consistency of …

An investigation of evaluation methods in automatic medical note generation

AB Abacha, W Yim, G Michalopoulos… - Findings of the …, 2023 - aclanthology.org
Recent studies on automatic note generation have shown that doctors can save significant
amounts of time when using automatic clinical note generation (Knoll et al., 2022) …

DocFinQA: A long-context financial reasoning dataset

V Reddy, R Koncel-Kedziorski, VD Lai… - arXiv preprint arXiv …, 2024 - arxiv.org
For large language models (LLMs) to be effective in the financial domain--where each
decision can have a significant impact--it is necessary to investigate realistic tasks and data …

Aligning factual consistency for clinical studies summarization through reinforcement learning

X Tang, A Cohan, M Gerstein - Proceedings of the 5th Clinical …, 2023 - aclanthology.org
In the rapidly evolving landscape of medical research, accurate and concise summarization
of clinical studies is crucial to support evidence-based practice. This paper presents a novel …

An investigation of evaluation metrics for automated medical note generation

AB Abacha, W Yim, G Michalopoulos, T Lin - arXiv preprint arXiv …, 2023 - arxiv.org
Recent studies on automatic note generation have shown that doctors can save significant
amounts of time when using automatic clinical note generation (Knoll et al., 2022) …

What are the desired characteristics of calibration sets? Identifying correlates on long form scientific summarization

G Adams, BH Nguyen, J Smith, Y Xia… - Proceedings of the …, 2023 - ncbi.nlm.nih.gov
Summarization models often generate text that is poorly calibrated to quality metrics
because they are trained to maximize the likelihood of a single reference (MLE). To address …