Y Gao, W Zhao, S Eger - arXiv preprint arXiv:2005.03724, 2020 - arxiv.org
We study unsupervised multi-document summarization evaluation metrics, which require neither human-written reference summaries nor human annotations (e.g., preferences …
We provide a literature review of Automatic Text Summarization (ATS) systems. We take a citation-based approach, starting with some popular and well-known papers that …
A desirable property of a reference-based evaluation metric that measures the content quality of a summary is that it should estimate how much information that summary has in …
Unlike general documents, it is recognised that the ease with which people can understand a biomedical text varies widely, owing to the highly technical nature of …
S Zhang, M Bansal - arXiv preprint arXiv:2109.11503, 2021 - arxiv.org
Human evaluation for summarization tasks is reliable but brings in issues of reproducibility and high costs. Automatic metrics are cheap and reproducible but sometimes poorly …
D Deutsch, D Roth - Proceedings of the 25th Conference on …, 2021 - aclanthology.org
Reference-based metrics such as ROUGE or BERTScore evaluate the content quality of a summary by comparing the summary to a reference. Ideally, this comparison should …
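The entry above describes reference-based metrics such as ROUGE, which score a summary by its overlap with a reference. As a rough illustration only (the function name `rouge_1` is ours; real implementations add tokenization, stemming, and multi-reference handling), ROUGE-1 can be sketched as clipped unigram overlap:

```python
from collections import Counter

def rouge_1(summary: str, reference: str) -> dict:
    """Illustrative ROUGE-1: precision, recall, and F1 from unigram overlap."""
    sum_counts = Counter(summary.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: a unigram counts at most as often as it appears in each text.
    overlap = sum((sum_counts & ref_counts).values())
    precision = overlap / max(sum(sum_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```

For example, scoring the summary "the cat sat" against the reference "the cat sat on the mat" gives perfect precision but recall of only 0.5, since half of the reference unigrams are missed.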
Large Language Models (LLMs) have enabled new ways to satisfy information needs. Although great strides have been made in applying them to settings like document ranking …
D Deutsch, D Roth - arXiv preprint arXiv:2007.05374, 2020 - arxiv.org
We present SacreROUGE, an open-source library for using and developing summarization evaluation metrics. SacreROUGE removes many obstacles that researchers face when …
One of the main challenges in the development of summarization tools is summarization quality evaluation. On the one hand, the human assessment of summarization quality …