A survey of controllable text generation using transformer-based pre-trained language models

H Zhang, H Song, S Li, M Zhou, D Song - ACM Computing Surveys, 2023 - dl.acm.org
Controllable Text Generation (CTG) is an emerging area in the field of natural language
generation (NLG). It is regarded as crucial for the development of advanced text generation …

Repairing the cracked foundation: A survey of obstacles in evaluation practices for generated text

S Gehrmann, E Clark, T Sellam - Journal of Artificial Intelligence Research, 2023 - jair.org
Evaluation practices in natural language generation (NLG) have many known flaws,
but improved evaluation approaches are rarely widely adopted. This issue has become …

Neural text summarization: A critical evaluation

W Kryściński, NS Keskar, B McCann, C Xiong… - arXiv preprint arXiv …, 2019 - arxiv.org
Text summarization aims at compressing long documents into a shorter form that conveys
the most important parts of the original document. Despite increased interest in the …

Twenty years of confusion in human evaluation: NLG needs evaluation sheets and standardised definitions

DM Howcroft, A Belz, M Clinciu… - 13th International …, 2020 - researchportal.hw.ac.uk
Human assessment remains the most trusted form of evaluation in NLG, but highly diverse
approaches and a proliferation of different quality criteria used by researchers make it …

Why we need new evaluation metrics for NLG

J Novikova, O Dušek, AC Curry, V Rieser - arXiv preprint arXiv …, 2017 - arxiv.org
The majority of NLG evaluation relies on automatic metrics, such as BLEU. In this paper, we
motivate the need for novel, system-and data-independent automatic evaluation methods …

Human evaluation of automatically generated text: Current trends and best practice guidelines

C van der Lee, A Gatt, E van Miltenburg… - Computer Speech & …, 2021 - Elsevier
Currently, there is little agreement as to how Natural Language Generation (NLG) systems
should be evaluated, with a particularly high degree of variation in the way that human …

Evaluating the state-of-the-art of end-to-end natural language generation: The E2E NLG challenge

O Dušek, J Novikova, V Rieser - Computer Speech & Language, 2020 - Elsevier
This paper provides a comprehensive analysis of the first shared task on End-to-End Natural
Language Generation (NLG) and identifies avenues for future research based on the results …

RankME: Reliable human ratings for natural language generation

J Novikova, O Dušek, V Rieser - arXiv preprint arXiv:1803.05928, 2018 - arxiv.org
Human evaluation for natural language generation (NLG) often suffers from inconsistent
user ratings. While previous research tends to attribute this problem to individual user …

A study of automatic metrics for the evaluation of natural language explanations

M Clinciu, A Eshghi, H Hastie - arXiv preprint arXiv:2103.08545, 2021 - arxiv.org
As transparency becomes key for robotics and AI, it will be necessary to evaluate the
methods through which transparency is provided, including automatically generated natural …

Chain of explanation: New prompting method to generate quality natural language explanation for implicit hate speech

F Huang, H Kwak, J An - Companion Proceedings of the ACM Web …, 2023 - dl.acm.org
Recent studies have exploited advanced generative language models to generate Natural
Language Explanations (NLE) for why a certain text could be hateful. We propose the Chain …