Dancing between success and failure: Edit-level simplification evaluation using SALSA

D Heineman, Y Dou, M Maddela, W Xu - arXiv preprint arXiv:2305.14458, 2023 - arxiv.org
Large language models (e.g., GPT-4) are uniquely capable of producing highly rated text
simplification, yet current human evaluation methods fail to provide a clear understanding of …

Reference matters: Benchmarking factual error correction for dialogue summarization with fine-grained evaluation framework

M Gao, X Wan, J Su, Z Wang, B Huai - arXiv preprint arXiv:2306.05119, 2023 - arxiv.org
Factuality is important to dialogue summarization. Factual error correction (FEC) of model-
generated summaries is one way to improve factuality. Current FEC evaluation that relies on …

Improving Factual Error Correction by Learning to Inject Factual Errors

X He, Q Zhang, AL Jin, J Ma, Y Yuan… - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Factual error correction (FEC) aims to revise factual errors in false claims with minimal
editing, making them faithful to the provided evidence. This task is crucial for alleviating the …

The student becomes the master: Outperforming GPT3 on Scientific Factual Error Correction

D Ashok, A Kulkarni, H Pham… - Findings of the …, 2023 - aclanthology.org
Due to the prohibitively high cost of creating error correction datasets, most Factual Claim
Correction methods rely on a powerful verification model to guide the correction process …

SciFix: Outperforming GPT3 on Scientific Factual Error Correction

D Ashok, A Kulkarni, H Pham, B Póczos - arXiv preprint arXiv:2305.14707, 2023 - arxiv.org
Due to the prohibitively high cost of creating error correction datasets, most Factual Claim
Correction methods rely on a powerful verification model to guide the correction process …

SciFix: Outperforming GPT3 on Scientific Factual Error Correction

D Ashok, A Kulkarni, H Pham, B Póczos - NeurIPS 2023 Workshop on … - openreview.net
Due to the prohibitively high cost of creating error correction datasets, most Factual Claim
Correction methods rely on a powerful verification model to guide the correction process …