M Huang, X Zhu, J Gao - ACM Transactions on Information Systems …, 2020 - dl.acm.org
There is a resurgent interest in developing intelligent open-domain dialog systems due to the availability of large amounts of conversational data and the recent progress on neural …
Current developments in large language models (LLMs) have enabled impressive zero-shot capabilities across various natural language tasks. An interesting application of these …
A Lopez - ACM Computing Surveys (CSUR), 2008 - dl.acm.org
Statistical machine translation (SMT) treats the translation of natural language as a machine learning problem. By examining many samples of human-produced translation, SMT …
E Chatzikoumi - Natural Language Engineering, 2020 - cambridge.org
This article presents the most up-to-date, influential automated, semiautomated and human metrics used to evaluate the quality of machine translation (MT) output and provides the …
A new metric\texttt {BaryScore} to evaluate text generation based on deep contextualized embeddings eg, BERT, Roberta, ELMo) is introduced. This metric is motivated by a new …
The evaluation of machine-generated image captions poses an interesting yet persistent challenge. Effective evaluation measures must consider numerous dimensions of similarity …
This paper describes a new evaluation metric, TER-Plus (TERp) for automatic evaluation of machine translation (MT). TERp is an extension of Translation Edit Rate (TER). It builds on …
Large language models (LLMs) have shown promise in accelerating radiology reporting by summarizing clinical findings into impressions. However, automatic impression generation …
Assessing the quality of candidate translations involves diverse linguistic facets. However, most automatic evaluation methods in use today rely on limited quality assumptions, such as …