Y Gao, W Zhao, S Eger - arXiv preprint arXiv:2005.03724, 2020 - arxiv.org
We study unsupervised multi-document summarization evaluation metrics, which require neither human-written reference summaries nor human annotations (e.g., preferences …
We provide a literature review of Automatic Text Summarization (ATS) systems. We take a citation-based approach, starting with some popular and well-known papers that …
A desirable property of a reference-based evaluation metric that measures the content quality of a summary is that it should estimate how much information that summary has in …
Unlike general documents, it is recognised that the ease with which people can understand a biomedical text varies widely, owing to the highly technical nature of …
S Zhang, M Bansal - arXiv preprint arXiv:2109.11503, 2021 - arxiv.org
Human evaluation for summarization tasks is reliable but brings in issues of reproducibility and high costs. Automatic metrics are cheap and reproducible but sometimes poorly …
D Deutsch, D Roth - Proceedings of the 25th Conference on …, 2021 - aclanthology.org
Reference-based metrics such as ROUGE or BERTScore evaluate the content quality of a summary by comparing the summary to a reference. Ideally, this comparison should …
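The entry above describes reference-based metrics such as ROUGE, which score a summary by its overlap with a reference. As a rough illustration only (the function name `rouge_1` is ours; real implementations add tokenization, stemming, and multi-reference handling), ROUGE-1 can be sketched as clipped unigram overlap:

```python
from collections import Counter

def rouge_1(summary: str, reference: str) -> dict:
    """Illustrative ROUGE-1: precision, recall, and F1 from unigram overlap."""
    sum_counts = Counter(summary.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: a unigram counts at most as often as it appears in each text.
    overlap = sum((sum_counts & ref_counts).values())
    precision = overlap / max(sum(sum_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```

For example, scoring the summary "the cat sat" against the reference "the cat sat on the mat" gives perfect precision but recall of only 0.5, since half of the reference unigrams are missed.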
Large Language Models (LLMs) have enabled new ways to satisfy information needs. Although great strides have been made in applying them to settings like document ranking …
D Deutsch, D Roth - arXiv preprint arXiv:2007.05374, 2020 - arxiv.org
We present SacreROUGE, an open-source library for using and developing summarization evaluation metrics. SacreROUGE removes many obstacles that researchers face when …
One of the main challenges in the development of summarization tools is summarization quality evaluation. On the one hand, the human assessment of summarization quality …