Siren's song in the AI ocean: a survey on hallucination in large language models

Y Zhang, Y Li, L Cui, D Cai, L Liu, T Fu… - arXiv preprint arXiv …, 2023 - arxiv.org
While large language models (LLMs) have demonstrated remarkable capabilities across a
range of downstream tasks, a significant concern revolves around their propensity to exhibit …

Zero-shot faithful factual error correction

KH Huang, HP Chan, H Ji - arXiv preprint arXiv:2305.07982, 2023 - arxiv.org
Faithfully correcting factual errors is critical for maintaining the integrity of textual knowledge
bases and preventing hallucinations in sequence-to-sequence models. Drawing on humans' …

On improving summarization factual consistency from natural language feedback

Y Liu, B Deb, M Teruel, A Halfaker, D Radev… - arXiv preprint arXiv …, 2022 - arxiv.org
Despite the recent progress in language generation models, their outputs may not always
meet user expectations. In this work, we study whether informational feedback in natural …

Do LVLMs understand charts? Analyzing and correcting factual errors in chart captioning

KH Huang, M Zhou, HP Chan, YR Fung… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent advancements in large vision-language models (LVLMs) have led to significant
progress in generating natural language descriptions for visual content and thus enhancing …

Learning to refine with fine-grained natural language feedback

M Wadhwa, X Zhao, JJ Li, G Durrett - arXiv preprint arXiv:2407.02397, 2024 - arxiv.org
Recent work has explored the capability of large language models (LLMs) to identify and
correct errors in LLM-generated responses. These refinement approaches frequently …

Interpretable automatic fine-grained inconsistency detection in text summarization

HP Chan, Q Zeng, H Ji - arXiv preprint arXiv:2305.14548, 2023 - arxiv.org
Existing factual consistency evaluation approaches for text summarization provide binary
predictions and limited insights into the weakness of summarization systems. Therefore, we …

Reference matters: Benchmarking factual error correction for dialogue summarization with fine-grained evaluation framework

M Gao, X Wan, J Su, Z Wang, B Huai - arXiv preprint arXiv:2306.05119, 2023 - arxiv.org
Factuality is important to dialogue summarization. Factual error correction (FEC) of model-
generated summaries is one way to improve factuality. Current FEC evaluation that relies on …

Promoting Topic Coherence and Inter-Document Consorts in Multi-Document Summarization via Simplicial Complex and Sheaf Graph

Y Atri, A Iyer, T Chakraborty… - Proceedings of the 2023 …, 2023 - aclanthology.org
Multi-document Summarization (MDS) characterizes compressing information from multiple
source documents to its succinct summary. An ideal summary should encompass all topics …

Marvista: exploring the design of a human-AI collaborative news reading tool

XA Chen, CS Wu, L Murakhovs'ka, P Laban… - ACM Transactions on …, 2023 - dl.acm.org
We explore the design of Marvista—a human-AI collaborative tool that employs a suite of
natural language processing models to provide end-to-end support for reading online news …

SWING: Balancing coverage and faithfulness for dialogue summarization

KH Huang, S Singh, X Ma, W Xiao, F Nan… - arXiv preprint arXiv …, 2023 - arxiv.org
Missing information is a common issue of dialogue summarization where some information
in the reference summaries is not covered in the generated summaries. To address this …