Understanding the societal impacts of machine translation: a critical review of the literature on medical and legal use cases

LN Vieira, M O'Hagan, C O'Sullivan - … , Communication & Society, 2021 - Taylor & Francis
The ready availability of machine translation (MT) systems such as Google Translate has
profoundly changed how society engages with multilingual communication practices. In …

The Eval4NLP shared task on explainable quality estimation: Overview and results

M Fomicheva, P Lertvittayakumjorn, W Zhao… - arXiv preprint arXiv …, 2021 - arxiv.org
In this paper, we introduce the Eval4NLP-2021 shared task on explainable quality
estimation. Given a source-translation pair, this shared task requires not only to provide a …

CometKiwi: IST-Unbabel 2022 submission for the quality estimation shared task

R Rei, M Treviso, NM Guerreiro, C Zerva… - arXiv preprint arXiv …, 2022 - arxiv.org
We present the joint contribution of IST and Unbabel to the WMT 2022 Shared Task on
Quality Estimation (QE). Our team participated on all three subtasks: (i) Sentence and Word …

Data-driven sentence simplification: Survey and benchmark

F Alva-Manchego, C Scarton, L Specia - Computational Linguistics, 2020 - direct.mit.edu
Sentence Simplification (SS) aims to modify a sentence in order to make it easier to read
and understand. In order to do so, several rewriting transformations can be performed such …

OpenKiwi: An open source framework for quality estimation

F Kepler, J Trénous, M Treviso, M Vera… - arXiv preprint arXiv …, 2019 - arxiv.org
We introduce OpenKiwi, a PyTorch-based open source framework for translation quality
estimation. OpenKiwi supports training and testing of word-level and sentence-level quality …

Machine translation decoding beyond beam search

R Leblond, JB Alayrac, L Sifre, M Pislar… - arXiv preprint arXiv …, 2021 - arxiv.org
Beam search is the go-to method for decoding auto-regressive machine translation models.
While it yields consistent improvements in terms of BLEU, it is only concerned with finding …

InfoLM: A new metric to evaluate summarization & data2text generation

PJA Colombo, C Clavel, P Piantanida - Proceedings of the AAAI …, 2022 - ojs.aaai.org
Assessing the quality of natural language generation (NLG) systems through human
annotation is very expensive. Additionally, human annotation campaigns are time …

MLQE-PE: A multilingual quality estimation and post-editing dataset

M Fomicheva, S Sun, E Fonseca, C Zerva… - arXiv preprint arXiv …, 2020 - arxiv.org
We present MLQE-PE, a new dataset for Machine Translation (MT) Quality Estimation (QE)
and Automatic Post-Editing (APE). The dataset contains eleven language pairs, with human …

Towards explainable evaluation metrics for machine translation

C Leiter, P Lertvittayakumjorn, M Fomicheva… - Journal of Machine …, 2024 - jmlr.org
Unlike classical lexical overlap metrics such as BLEU, most current evaluation metrics for
machine translation (for example, COMET or BERTScore) are based on black-box large …

Denoising pre-training for machine translation quality estimation with curriculum learning

X Geng, Y Zhang, J Li, S Huang, H Yang… - Proceedings of the …, 2023 - ojs.aaai.org
Quality estimation (QE) aims to assess the quality of machine translations when reference
translations are unavailable. QE plays a crucial role in many real-world applications of …