Experts, errors, and context: A large-scale study of human evaluation for machine translation

M Freitag, G Foster, D Grangier, V Ratnakar… - Transactions of the …, 2021 - direct.mit.edu
Human evaluation of modern high-quality machine translation systems is a difficult problem,
and there is increasing evidence that inadequate evaluation procedures can lead to …

How to do human evaluation: A brief introduction to user studies in NLP

H Schuff, L Vanderlyn, H Adel, NT Vu - Natural Language …, 2023 - cambridge.org
Many research topics in natural language processing (NLP), such as explanation
generation, dialog modeling, or machine translation, require evaluation that goes beyond …

Adequacy–fluency metrics: Evaluating MT in the continuous space model framework

RE Banchs, LF D'Haro, H Li - IEEE/ACM Transactions on Audio …, 2015 - ieeexplore.ieee.org
This work extends and evaluates a two-dimensional automatic evaluation metric for machine
translation, which is designed to operate at the sentence level. The metric is based on the …

Informative manual evaluation of machine translation output

M Popović - 2020 - doras.dcu.ie
This work proposes a new method for manual evaluation of Machine Translation (MT) output
based on marking actual issues in the translated text. The novelty is that the evaluators are …

Extrinsic evaluation of machine translation metrics

N Moghe, T Sherborne, M Steedman… - arXiv preprint arXiv …, 2022 - arxiv.org
Automatic machine translation (MT) metrics are widely used to distinguish the translation
qualities of machine translation systems across relatively large test sets (system-level …

Deep learning for semantic similarity

A Sanborn, J Skryzalin - CS224d: Deep Learning for Natural …, 2015 - cs224d.stanford.edu
Evaluating the semantic similarity of two sentences is a task central to automated
understanding of natural languages. We discuss the problem of semantic similarity and …

Agree to disagree: Analysis of inter-annotator disagreements in human evaluation of machine translation output

M Popović - Proceedings of the 25th Conference on …, 2021 - aclanthology.org
This work describes an analysis of inter-annotator disagreements in human evaluation of
machine translation output. The errors in the analysed texts were marked by multiple …

Machine Translation with Large Language Models: Prompt Engineering for Persian, English, and Russian Directions

N Pourkamali, SE Sharifi - arXiv preprint arXiv:2401.08429, 2024 - arxiv.org
Generative large language models (LLMs) have demonstrated exceptional proficiency in
various natural language processing (NLP) tasks, including machine translation, question …

Ranking vs. regression in machine translation evaluation

K Duh - Proceedings of the Third Workshop on Statistical …, 2008 - aclanthology.org
Automatic evaluation of machine translation (MT) systems is an important research topic for
the advancement of MT technology. Most automatic evaluation methods proposed to date …

Affective decoding for empathetic response generation

C Zeng, G Chen, C Lin, R Li, Z Chen - arXiv preprint arXiv:2108.08102, 2021 - arxiv.org
Understanding speaker's feelings and producing appropriate responses with emotion
connection is a key communicative skill for empathetic dialogue systems. In this paper, we …